EXHIBIT 13 



(19) Europäisches Patentamt
European Patent Office
Office européen des brevets

EP 0 992 511 A1

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication:
12.04.2000 Bulletin 2000/15

(51) Int. Cl.7: C07H 21/00, C12Q 1/68



(21) Application number: 99113790.2

(22) Date of filing: 23.01.1997 



(84) Designated Contracting States: 

AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC
NL PT SE 

(30) Priority: 23.01.1996 US 589260 
23.01.1996 US 10462 P 



(62) Document number(s) of the earlier application(s) in
accordance with Art. 76 EPC:
97905634.8 / 0 868 535

(71) Applicant: Rapigene, Inc. 
Bothell, Washington 98021 (US) 

(72) Inventors:

• Howbert, J. Jeffry
Washington 98005 (US)
• Mulligan, John T.
Washington 98105 (US)
• Tabone, John C.
Washington 98011 (US)
• Van Ness, Jeffrey
Washington 98125 (US)

(74) Representative: 

Gowshall, Jonathan Vallance 
FORRESTER & BOEHMERT 
Franz-Joseph-Strasse 38 
80801 München (DE)

Remarks: 

This application was filed on 14-07-1999 as a
divisional application to the application mentioned
under INID code 62.




(57) Methods and compounds, including compositions
therefrom, are provided for determining the
sequence of nucleic acid molecules. The methods permit
the determination of multiple nucleic acid sequences
simultaneously. The compounds are used as tags to
generate tagged nucleic acid fragments which are
complementary to a selected target nucleic acid molecule.
Each tag is correlative with a particular nucleotide and,
in a preferred embodiment, is detectable by mass
spectrometry. Following separation of the tagged fragments
by sequential length, the tags are cleaved from the
tagged fragments. In a preferred embodiment, the tags
are detected by mass spectrometry and the sequence
of the nucleic acid molecule is determined therefrom.
The individual steps of the methods can be used in
automated format, e.g., by the incorporation into
systems.



Printed by Xerox (UK) Business Services

Applicants: Jingyue Ju et al.
Serial No.: 10/735,081
Filed: December 11, 2003
Exhibit 13




Description 
TECHNICAL FIELD 



[0001] The present invention relates generally to methods and compositions for determining the sequence of
nucleic acid molecules, and more specifically, to methods and compositions which allow the determination of multiple
nucleic acid sequences simultaneously.

BACKGROUND OF THE INVENTION 






[0002] Deoxyribonucleic acid (DNA) sequencing is one of the basic techniques of biology. It is at the heart of
molecular biology and plays a rapidly expanding role in the rest of biology. The Human Genome Project is a multi-national
effort to read the entire human genetic code. It is the largest project ever undertaken in biology, and has already begun
to have a major impact on medicine. The development of cheaper and faster sequencing technology will ensure the
success of this project. Indeed, a substantial effort has been funded by the NIH and DOE branches of the Human
Genome Project to improve sequencing technology, however, without a substantial impact on current practices (Sulston
and Waterston, Nature 376:175, 1995).

[0003] In the past two decades, determination and analysis of nucleic acid sequence has formed one of the building
blocks of biological research. This, along with new investigational tools and methodologies, has allowed scientists to
study genes and gene products in order to better understand the function of these genes, as well as to develop new
therapeutics and diagnostics.

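[0004] Two different DNA sequencing methodologies that were developed in 1977 are still in wide use today.
Briefly, the enzymatic method described by Sanger (Proc. Natl. Acad. Sci. (USA) 74:5463, 1977), which utilizes
dideoxy-terminators, involves the synthesis of a DNA strand from a single-stranded template by a DNA polymerase. The
Sanger method of sequencing depends on the fact that dideoxynucleotides (ddNTPs) are incorporated into the growing
strand in the same way as normal deoxynucleotides (albeit at a lower efficiency). However, ddNTPs differ from normal
deoxynucleotides (dNTPs) in that they lack the 3'-OH group necessary for chain elongation. When a ddNTP is
incorporated into the DNA chain, the absence of the 3'-hydroxy group prevents the formation of the next phosphodiester
bond and the DNA fragment is terminated with the ddNTP complementary to the base in the template DNA. The Maxam
and Gilbert method (Maxam and Gilbert, Proc. Natl. Acad. Sci. (USA) 74:560, 1977) employs a chemical degradation
method of the original DNA (in both cases the DNA must be clonal). Both methods produce populations of fragments
that begin from a particular point and terminate in every base that is found in the DNA fragment that is to be sequenced.
The termination of each fragment is dependent on the location of a particular base within the original DNA fragment.
The DNA fragments are separated by polyacrylamide gel electrophoresis and the order of the DNA bases (adenine,
cytosine, thymine, guanine; also known as A, C, T, G, respectively) is read from an autoradiograph of the gel.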
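
The chain-termination principle can be illustrated with a short sketch. This is a toy model and not part of the application: the function name `termination_lengths` and the example template are assumptions made for illustration, and incorporation efficiency is ignored. For each ddNTP, the sketch simply lists the fragment lengths at which a growing strand could terminate.

```python
# Toy model of dideoxy chain termination (illustrative assumption only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def termination_lengths(template):
    """Map each terminating base to the fragment lengths it can produce.

    `template` is written in the order the polymerase copies it, so
    position i of the new strand carries the complement of template[i].
    """
    lengths = {base: [] for base in "ACGT"}
    for position, template_base in enumerate(template, start=1):
        incorporated = COMPLEMENT[template_base]
        # A ddNTP of this base type would stop the strand at this length.
        lengths[incorporated].append(position)
    return lengths


print(termination_lengths("TACGGT"))
```

Sorting all lengths across the four reactions and reading off the terminating base at each length recovers the sequence of the synthesized strand, which is what the gel read-out accomplishes.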

on^then^SrenTpprcTh^ 1988), 
DNA temp,ates (samples^ ProcSs^ * a number of 

end of the processing, the DNA molecules of interest are -f sequence '"formation at the 

tagged DNA molecules are poo,ed. -SaTc^ 

pooled samples, the DNA is transferred to a solid suDDori t and then ^JZ^T electrophoresis of the 

labeled oligonucleotides. These membranes IT^KSSSES! * SerieS 01 "P* 1 * 

ducing. in each set of probing, -u***^'^^ ZaSJSS^r ^ 
reaction and gel yields a quantity of data eaui V ai*nt tr> «k* ^ * sequencing methods. Thus each 

me nun*.. « „4e S used If ^JSSSTL? ~™«<°™' ™«*« and 9« s muBipKec „ 

men, o, an^.on an^e* Jr^^KS™^" 01 ;*^ 
run on a single lane. Each dideoxy-terminator ream™ « ro ™L~^ u aideoxy-termmator reactions to be 






therefore, about 6000 nt can be sequenced per hour per sequencer 

Fourier transform (FT) mass spectrometry (MS) teilTII^ electrospray .omzation (ESQ and 
' 5 charged ions measured at high (1(£ reso^a ™ r ~ SenS '? Ve - D,ssociatjo " P^cts of mulUply- 

sequence in less than one rnirSeonl^^ bacKbone deavages proving the full 

For molecular weight measurements EsSms has been ELSIES! 1 SoC " 1 76:4893 - 1 994 >- 

22:3895. 1994). ESI/FTMS ap^c TLTva^e «nS.^t T I 9 "^* ^ * ^ *** 

of DNA fragments that vary in length by one nudeotfde. TlTKS^IiSEJ S ? T*' to 3 
the original primer, a replicated sequence of part of the DNA^^^lS^^^^ ""'f^!? each contain 
DNA molecules is produced that contain the orimer a^ditter in iln^' t the ^ deox V terminator. That is. a set of 
[0010] Brennen et a!, (flu -S^Z^I^I^S S^ 0 *"^ 0 " 6 nuctaoB * reSidue - 
stable isotopes of sulfur as DNA labels Sat enable ^^SZ'nJL t > S descr,bed me « 1 «is to use the four 

mk-jet printer head, and ten subjected to compete combi^inn in „ " P ,co "ters are obtained by a modified 
of the labeled DNA to SO a . whic ^^e^o 3^^! ^ T 5 PrOC6SS Qxidi2es the *i°Pn°sphates 
SO mass unit represented is JSgSSZ&R S^OTTaX? SETS? ^ 
olution of the DNA fragments as thev emerae from th« roi, ,mo hLI w ? ■ T Ma,nten ance of the res- 

the mass spectrometer is co^^Z^^^i^^^^l 00 ?"? SU " iCient,y sma " fractons - Because 
electrophoresis. This processes unfo^natl^S^^ ifbt^T^H^ 6 ° f 3n3,ySiS ' 5 by the rate of 

Two other basic constraints a^^^^^^l^S^^"^ 9as and has no » been commercialized, 
bariccontamin^ 

-nr^ 

^°ir' c ^r;r n ; hen 363 is ^ 100 fow » ^ - ^^nSr^Sflr - ^ to obtein at 

SSr. an°a^ "'a" SSSSSZS^ ^ ^ ™ 3 " «*~ 

thesizes and labels multiple iilmSSS y ^SS!SSTS V*"™ The SyntheSi2er s ^ 

DNA on membranes. The detector identifies ^Si«Sn -Stl^ ^ oh 9° mere »• «ed to probe immobilized 
which constructs a sequence a^e^ to 3 Central «"W 

iterate process, a DNA sequence e^^l^STS^Sr - " Synth6S ' S * ^ 3 " 

plate. A primer is hybridized to mtSS^SZSSSZl l^n^l " hyWdbBd * 3 nudeic acid tem " 

to the primed template in the presence ^ * 3,80 hybridized 

primer. Modifications pern* the dJEfifiS £nS^ to the 

get nucleotide residues in the nucleic acid template that are spaced aHnte ^ Z 3! ! me "? ers , of a {,rst S6t <* ter " 
•abe.ed .gated product Is formed wherein *e posj XgSZZ E^^^SSS^ 




Serge, reaction is not used). DNA is donWVnto tESL JS2 'th. TJ2£ <•» 

^^^^ 

smatt numb* =, uas wNcn c„ be resoSS, <^£«3^E^^ " * *" 

ebo»e uses ewe tags, but Ih. use and detaoion it these £ IsXdSi P °°* n9 8>aem * eussM 

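[0016] The present invention discloses novel compositions and methods which may be utilized to sequence nucleic
acid molecules with much greater speed and sensitivity than the methods described above, and further provides
other related advantages.
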
SUMMARY OF THE INVENTION 






[0017] Within one aspect of the invention, methods are provided for determining the sequence of a nucleic acid
molecule, comprising (a) generating tagged nucleic acid fragments which are complementary to a selected target nucleic
acid molecule, wherein a tag is correlative with a particular nucleotide and detectable by non-fluorescent spectrometry
or potentiometry; (b) separating the tagged fragments by sequential length; (c) cleaving the tags from the tagged
fragments; and (d) detecting the tags by non-fluorescent spectrometry or potentiometry, and therefrom determining the
sequence of the nucleic acid molecule.
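
Steps (b) through (d) amount to ordering fragments by length and translating each detected tag into the nucleotide it is correlative with. The following is a minimal sketch under stated assumptions: the function name `read_sequence`, the tag identifiers and the input format are hypothetical, and real detector output would need peak calling before reaching this stage.

```python
def read_sequence(detections, tag_to_base):
    """Reconstruct a sequence from (fragment_length, tag_id) detections.

    `detections` may arrive in any order; `tag_to_base` maps each tag
    identifier to the nucleotide that the tag is correlative with.
    """
    ordered = sorted(detections)          # step (b): order by fragment length
    bases = [tag_to_base[tag] for _length, tag in ordered]  # steps (c) and (d)
    return "".join(bases)


# Hypothetical tag identifiers, one per terminating nucleotide.
tags = {"t_A": "A", "t_C": "C", "t_G": "G", "t_T": "T"}
print(read_sequence([(3, "t_G"), (1, "t_A"), (2, "t_T")], tags))
```

The result is the sequence of the tagged (synthesized) strand; the target sequence follows from it by complementarity.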

[0018] In another aspect, the invention provides a compound of the formula:



Tms-L-X






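wherein Tms is an organic group detectable by mass spectrometry, comprising carbon, at least one of hydrogen and
fluoride, and optional atoms selected from oxygen, nitrogen, sulfur, phosphorus and iodine; L is an organic group which
allows a Tms-containing moiety to be cleaved from the remainder of the compound, wherein the Tms-containing moiety
comprises a functional group which supports a single ionized charge state when the compound is subjected to mass
spectrometry and is selected from tertiary amine, quaternary amine and organic acid; X is a functional group selected
from hydroxyl, amino, thiol, carboxylic acid, haloalkyl, and derivatives thereof which either activate or inhibit the activity
of the group toward coupling with other moieties, or is a nucleic acid fragment attached to L at other than the 3' end of
the nucleic acid fragment; with the provisos that the compound is not bonded to a solid support through X nor has a
mass of less than 250 daltons.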

hydrogen and fluoride, and SSTS^SS^^Z T*™^ COmPriSin9 * ,6ast ° ne of 

organic group which allows a ^^^3'^ ^ Phosphorus and iodine; L is an 
^-containing moiety comprises a functional ™^£h! C ' eav ^ from * e 'emamder of the compound, wherein the 
subjected to mass ^ecZetry iS^S^T^SiSrS^*^! ^ ^ ** <he ""^ * 
nudeic acid fragmer, wherein conju^l^ 

two compounds have either the same T™ or the same MOI : and wnerein ™> 

hydrogen and fluoride, and optional aLS mSSS^, n*vTn Sp f Clrom< ** «*"Prising carbon, at least one of 
organic group which allows a T*4wESa mS* t ^TITI?, T"" SU,fUr - P hos P horus ^d iodine; L is an 
^-containing moiety comprises aTunrtiona. 7S^!^ ^ ^S?Z « *» C ° mp0und - wherein 
is subjected to mass spectrometry arS 2SSSS^2f2?f^ 9 '° niZed ° harge Sta,e When 1,16 compound 

[0021] In another aspect, the invention provides for a composition comprising a plurality of sets of compounds,






each set of compounds having the formula Tms-L-MOI, wherein Tms is an organic group detectable by mass
spectrometry, comprising carbon, at least one of hydrogen and fluoride, and optional atoms selected from oxygen,
nitrogen, sulfur, phosphorus and iodine; L is an organic group which allows a Tms-containing moiety to be cleaved from
the remainder of the compound, wherein the Tms-containing moiety comprises a functional group which supports a single
ionized charge state when the compound is subjected to mass spectrometry and is selected from tertiary amine,
quaternary amine and organic acid; MOI is a nucleic acid fragment wherein L is conjugated to the MOI at a location other
than the 3' end of the MOI; wherein within a set, all members have the same Tms group, and the MOI fragments have
variable lengths that terminate with the same dideoxynucleotide selected from ddAMP, ddGMP, ddCMP and ddTMP;
and wherein between sets, the Tms groups differ by at least 2 amu.
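
The requirement that tag groups in different sets differ by at least 2 amu is easy to check mechanically. The sketch below is an illustration only: the function name `min_spacing_ok` and the tag masses are hypothetical values chosen for the example, not masses disclosed in the application.

```python
from itertools import combinations


def min_spacing_ok(tag_masses, min_delta=2.0):
    """Return True if every pair of tag masses differs by at least min_delta amu.

    `tag_masses` maps a set label (here, the terminating dideoxynucleotide)
    to the mass of the tag group shared by all members of that set.
    """
    pairs = combinations(tag_masses.values(), 2)
    return all(abs(a - b) >= min_delta for a, b in pairs)


# Hypothetical tag masses (amu), one per set.
example = {"ddAMP": 300.0, "ddGMP": 302.0, "ddCMP": 310.5, "ddTMP": 320.0}
print(min_spacing_ok(example))
```

A pairwise check is used rather than comparing only neighbouring masses so that the input does not need to be sorted first.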
[0022] In another aspect, the invention provides for a composition comprising a first plurality of sets of compounds
as described in the preceding paragraph, in combination with a second plurality of sets of compounds having the
formula Tms-L-MOI, wherein Tms is an organic group detectable by mass spectrometry, comprising carbon, at least one
of hydrogen and fluoride, and optional atoms selected from oxygen, nitrogen, sulfur, phosphorus and iodine; L is an
organic group which allows a Tms-containing moiety to be cleaved from the remainder of the compound, wherein the
Tms-containing moiety comprises a functional group which supports a single ionized charge state when the compound
is subjected to mass spectrometry and is selected from tertiary amine, quaternary amine and organic acid; MOI is a
nucleic acid fragment wherein L is conjugated to the MOI at a location other than the 3' end of the MOI; and wherein all
members within the second plurality have an MOI sequence which terminates with the same dideoxynucleotide
selected from ddAMP, ddGMP, ddCMP and ddTMP; with the proviso that the dideoxynucleotide present in the
compounds of the first plurality is not the same dideoxynucleotide present in the compounds of the second plurality.

[0023] In another aspect, the invention provides for a kit for DNA sequencing analysis. The kit comprises a plurality
of container sets, each container set comprising at least five containers, wherein a first container contains a vector, and
second, third, fourth and fifth containers contain compounds of the formula Tms-L-MOI wherein Tms is an organic
group detectable by mass spectrometry, comprising carbon, at least one of hydrogen and fluoride, and optional atoms
selected from oxygen, nitrogen, sulfur, phosphorus and iodine; L is an organic group which allows a Tms-containing
moiety to be cleaved from the remainder of the compound, wherein the Tms-containing moiety comprises a functional
group which supports a single ionized charge state when the compound is subjected to mass spectrometry and is
selected from tertiary amine, quaternary amine and organic acid; and MOI is a nucleic acid fragment wherein L is
conjugated to the MOI at a location other than the 3' end of the MOI; such that the MOI for the second, third, fourth and
fifth containers is identical and complementary to a portion of the vector within the set of containers, and the Tms group
within each container is different from the other Tms groups in the kit.

[0024] In another aspect, the invention provides for a system for determining the sequence of a nucleic acid
molecule. The system comprises a separation apparatus that separates tagged nucleic acid fragments, an apparatus that
cleaves from a tagged nucleic acid fragment a tag which is correlative with a particular nucleotide and detectable by
electrochemical detection, and an apparatus for potentiostatic amperometry.



may be separately identified 
[0026] These and other aspects of the present invention will become evident upon reference to the following
detailed description and attached drawings. In addition, various references are set forth below which describe in more
detail certain procedures or compositions (e.g., plasmids, etc.), and are therefore incorporated by reference in their entirety.



BRIEF DESCRIPTION OF THE DRAWINGS
[0027] 






Figure 1 depicts the flowchart for the synthesis of pentafluorophenyl esters of chemically cleavable mass
spectroscopy tags, to liberate tags with carboxyl amide termini.

cT^e^^ 

SSSKTiSST" ^ SXn1heSiS ° f -« - ■ set of 36 photochemicaHy 

s'SL'cStgs* 6 Synth6SiS ° f 3 561 ° f 36 photochemicaHy cleavable mass 

Figure 9 depicts the synthesis of 36 photochemically cleavable mass spectroscopy tagged oligonucleotides made
from the corresponding set of 36 tetrafluorophenyl esters of photochemically cleavable mass spectroscopy tag




acids. 






F.gure 1 0 depicts the synthesis of 36 photochemically cleavable mass soectrosrn™ ta^wi r > 

from the corresponding set of 36 amine-terminated X^^J^^^^^Tf^ 65 ™ d * 

Figure 11 illustrates the simultaneous detection of multiple tags by mass spectrometry.

Figure 12 shows the mass spectrogram of the alpha-cyano matrix alone.

Figure 13 depicts a modularly-constructed tagged nucleic acid fragment.

DETAILED DESCRIPTION OF THE INVENTION 

Compounds of the invention may be viewed as having the general formula:






T-L-X 






T-L-MOI and T-L-Lh 
[0029] For reasons described in detail below, sets of T-L-MOI compounds may be purposely subjected to conditions
that cause the labile bond(s) to break, thus releasing a tag moiety from the remainder of the compound. The tag moiety
is then characterized by one or more analytical techniques.

[0030] As an example of a representative compound of the invention wherein L is a direct bond, see structure (i):



Structure (i)

[Chemical structure; labeled parts: Tag component, Linker (L) component, Molecule of Interest component (Nucleic Acid Fragment).]



The amide bond is labile relative to the bonds in T because, as recognized in the art, an amide bond may be chemically
cleaved (broken) by acid or base conditions which leave the bonds within the tag component unchanged. Thus, a tag
moiety (i.e., the cleavage product that contains T) may be released as shown below:



[Reaction scheme: Structure (i), treated with acid or base, yields the Tag Moiety and the Remainder of the Compound (Nucleic Acid Fragment).]



Structure (ii)

[Chemical structure; labeled parts: NO2 group, linker L, MOI (Nucleic Acid Fragment).]



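Exposure of structure (ii) to actinic radiation of a specified wavelength causes selective cleavage of the benzylamine
bond (see bond denoted with heavy line in structure (ii)). Thus, structure (ii) has the same T and MOI groups as
structure (i), however the linker group contains multiple atoms and bonds within which there is a particularly labile
bond. Photolysis of structure (ii) thus releases a tag moiety (T-containing moiety) from the remainder of the compound,
as shown below.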



[Reaction scheme: Structure (ii) yields the Tag Moiety and the Remainder of the Compound (Nucleic Acid Fragment).]



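[0032] The invention thus provides compounds which, upon exposure to appropriate cleavage conditions, undergo a
cleavage reaction so as to release a tag moiety from the remainder of the compound. Compounds of the invention may
be described in terms of the tag moiety, the MOI (or precursor thereto, Lh), and the labile bond(s) which join the
two groups together. Alternatively, the compounds of the invention may be described in terms of the components from
which they are formed. Thus, the compounds may be described as the reaction product of a tag reactant, a linker
reactant and a MOI reactant, as follows.

[0033] The tag reactant consists of a chemical handle (Th) and a variable component (Tvc), so that the tag reactant
is seen to have the general structure:

Tvc-Th

To illustrate this nomenclature, reference may be made to structure (iii), which shows a tag reactant that may be used
to prepare the compound of structure (ii).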



Structure (iii)

[Chemical structure; labeled parts: Tag Variable Component, Tag Handle.]



[0034] In structure (iii), the tag handle (-C(=O)-A) simply provides an avenue for reacting the tag reactant with the
linker reactant to form a T-L moiety. The group "A" in structure (iii) indicates that the carboxyl group is in a chemically
active state, so it is ready for coupling with other handles. "A" may be, for example, a hydroxyl group or
pentafluorophenoxy, among many other possibilities. The invention provides a large number of possible tag handles
which may be bonded to a tag variable component, as discussed in detail below. The tag variable component is thus a
part of "T" in the formula T-L-X, and will also be part of the tag moiety that forms from the reaction that cleaves L.

[0035] As also discussed in detail below, the tag variable component is so-named because, in preparing sets of
compounds according to the invention, it is desired that members of a set have unique variable components, so that the
individual members may be distinguished from one another by an analytical technique. As one example, the tag variable






[0036] Likewise, the linker reactant may be described in terms of its chemical handles (there are necessarily at
least two, each of which may be designated as Lh) which flank a linker labile component, where the linker labile
component consists of the required labile moiety (L2) and optional labile moieties (L1 and L3), where the optional
labile moieties effectively serve to separate L2 from the handles Lh, and the required labile moiety serves to provide
a labile bond within the linker labile component. Thus, the linker reactant may be seen to have the general formula:

Lh-L1-L2-L3-Lh



[0037] The nomenclature used to describe the linker reactant may be illustrated in view of structure (iv), which
again draws from the compound of structure (ii):







[0038] As structure (iv) illustrates, atoms may serve in more than one functional role. Thus, in structure (iv), the
benzyl nitrogen functions as a chemical handle in allowing the linker reactant to join to the tag reactant via an
amide-forming reaction, and subsequently also serves as a necessary part of the structure of the labile moiety L2 in that the
benzylic carbon-nitrogen bond is particularly susceptible to photolytic cleavage. Structure (iv) also illustrates that a
linker reactant may have an L3 group (in this case, a methylene group), although not have an L1 group. Likewise, linker
reactants may have an L1 group but not an L3 group, or may have L1 and L3 groups, or may have neither L1 nor L3 groups.

[0039] The MOI reactant is a suitably reactive form of a molecule of interest. Where the molecule of interest is a






A tag reactant (having a chemical tag handle and a tag variable component), a linker reactant (having two chemical
linker handles, a required labile moiety and 0-2 optional labile moieties) and a MOI reactant (having a molecule of
interest component and a chemical molecule of interest handle) are reacted to form T-L-MOI. Thus, to form T-L-MOI,
either the tag reactant and the linker reactant are first reacted together to provide T-L-Lh, and then the MOI reactant
is reacted with T-L-Lh so as to provide T-L-MOI, or else (less preferably) the linker reactant and the MOI reactant are
reacted together first to provide Lh-L-MOI, and then Lh-L-MOI is reacted with the tag reactant to provide T-L-MOI. For
purposes of convenience, compounds having the formula T-L-MOI will be described in terms of the tag reactant, the
linker reactant and the MOI reactant which may be used to form such compounds. Of course, the same compounds of
formula T-L-MOI could be prepared by other (typically, more laborious) methods, and still fall within the scope of the
inventive T-L-MOI compounds.
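
The component nomenclature and the two assembly routes can be summarized as a small data model. This is a bookkeeping sketch only: the class name `LinkerReactant`, the function `assembly_route` and the field names are assumptions introduced for illustration, and nothing here models actual chemistry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LinkerReactant:
    """Linker reactant with a required labile moiety and 0-2 optional ones."""
    l2: str                      # required labile moiety
    l1: Optional[str] = None     # optional labile moiety on the tag side
    l3: Optional[str] = None     # optional labile moiety on the MOI side

    def formula(self):
        """Spell out which moieties sit between the two handles Lh."""
        middle = []
        if self.l1 is not None:
            middle.append("L1")
        middle.append("L2")
        if self.l3 is not None:
            middle.append("L3")
        return "-".join(["Lh"] + middle + ["Lh"])


def assembly_route(tag_first=True):
    """List the intermediate and the product for either order of coupling."""
    if tag_first:
        return ["T-L-Lh", "T-L-MOI"]      # tag + linker first, then MOI
    return ["Lh-L-MOI", "T-L-MOI"]        # linker + MOI first (less preferred)


# A linker with an L3 group but no L1 group, as in the structure (iv) discussion.
linker = LinkerReactant(l2="benzylic C-N bond", l3="methylene")
print(linker.formula())
print(assembly_route())
```

Both routes end at the same product, which is why the compounds can be described in terms of their three reactants regardless of the order of coupling.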

[0041] In any event, the invention provides that a T-L-MOI compound be subjected to cleavage conditions, such that
a tag moiety is released from the remainder of the compound. The tag moiety will comprise at least the tag variable
component, and will typically additionally comprise some or all of the atoms from the tag handle, some or all of the
atoms from the linker handle that was used to join the tag reactant to the linker reactant, the optional labile moiety L1 if
this group was present in T-L-MOI, and will perhaps contain some part of the required labile moiety L2 depending on
the precise structure of L2 and the nature of the cleavage chemistry. For convenience, the tag moiety may be referred
to as the T-containing moiety because T will typically constitute the major portion (in terms of mass) of the tag moiety.
[0042] Given this introduction to one aspect of the present invention, the various components T, L and X will be
described in detail. This description begins with the following definitions of certain terms, which will be used hereinafter
in describing T, L and X.






[0043] As used herein, the term "nucleic acid fragment" means a molecule which is complementary to a selected
target nucleic acid molecule (i.e., complementary to all or a portion thereof), and may be derived from nature or
synthetically or recombinantly produced, including non-naturally occurring molecules, and may be in double or single
stranded form where appropriate; and includes an oligonucleotide (e.g., DNA or RNA), a primer, a probe, a nucleic acid
analog (e.g., PNA), an oligonucleotide which is extended in a 5' to 3' direction by a polymerase, a nucleic acid which is
cleaved chemically or enzymatically, a nucleic acid that is terminated with a dideoxy terminator or capped at the 3' or 5'
end with a compound that prevents polymerization at the 5' or 3' end, and combinations thereof. The complementarity
of a nucleic acid fragment to a selected target nucleic acid molecule generally means the exhibition of at least about
70% specific base pairing throughout the length of the fragment. Preferably the nucleic acid fragment exhibits at least
about 80% specific base pairing; and most preferably at least about 90%. Assays for determining the percent mismatch
(and thus the percent specific base pairing) are well known in the art and are based upon the percent mismatch as a
function of the Tm when referenced to the fully base paired control.
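
The percentage thresholds can be made concrete with a sequence-level calculation. Note the assumptions: the text describes Tm-based assays, whereas this sketch simply counts Watson-Crick pairs between two known, equal-length sequences; the function names and the hard cut-offs (the text says "about") are illustrative choices.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_base_pairing(fragment, target_region):
    """Percent of fragment positions that pair with the target region.

    Both sequences are written 5' to 3' and must have equal length;
    pairing is antiparallel, so the target is read in reverse.
    """
    if len(fragment) != len(target_region):
        raise ValueError("sequences must have equal length")
    paired = sum(
        COMPLEMENT[f] == t for f, t in zip(fragment, reversed(target_region))
    )
    return 100.0 * paired / len(fragment)


def classify(percent):
    """Bucket a pairing percentage using the 70/80/90 thresholds."""
    if percent >= 90:
        return "most preferred"
    if percent >= 80:
        return "preferred"
    if percent >= 70:
        return "complementary"
    return "below threshold"


print(classify(percent_base_pairing("AAGGC", "GCCTA")))
```

In the usage line, one of five positions is mismatched, giving 80% pairing.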

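[0044] The term "alkyl," alone or in combination, refers to a saturated, straight-chain or branched-chain
hydrocarbon radical containing from 1 to 10, preferably from 1 to 6 and more preferably from 1 to 4, carbon atoms.
Examples of such radicals include, but are not limited to, methyl, ethyl, n-propyl, iso-propyl, n-butyl, iso-butyl, sec-butyl,
tert-butyl, pentyl, iso-amyl, hexyl, decyl and the like. The term "alkylene" refers to a saturated, straight-chain or
branched chain hydrocarbon diradical containing from 1 to 10, preferably from 1 to 6 and more preferably from 1 to 4,
carbon atoms. Examples of such diradicals include, but are not limited to, methylene, ethylene (-CH2-CH2-), propylene,
and the like.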

[0045] The term "alkenyl," alone or in combination, refers to a straight-chain or branched-chain hydrocarbon radical
having at least one carbon-carbon double bond in a total of from 2 to 10, preferably from 2 to 6 and more preferably from
2 to 4, carbon atoms. Examples of such radicals include, but are not limited to, ethenyl, E- and Z-propenyl, isopropenyl,
E- and Z-butenyl, E- and Z-isobutenyl, E- and Z-pentenyl, decenyl and the like. The term "alkenylene" refers to a
straight-chain or branched-chain hydrocarbon diradical having at least one carbon-carbon double bond in a total of from
2 to 10, preferably from 2 to 6 and more preferably from 2 to 4, carbon atoms. Examples of such diradicals include, but
are not limited to, methylidene (=CH2), ethylidene (-CH=CH-), propylidene (-CH2-CH=CH-) and the like.

[0046] The term "alkynyl," alone or in combination, refers to a straight-chain or branched-chain hydrocarbon radical
having at least one carbon-carbon triple bond in a total of from 2 to 10, preferably from 2 to 6 and more preferably from
2 to 4, carbon atoms. Examples of such radicals include, but are not limited to, ethynyl (acetylenyl), propynyl
(propargyl), butynyl, hexynyl, decynyl and the like. The term "alkynylene," alone or in combination, refers to a
straight-chain or branched-chain hydrocarbon diradical having at least one carbon-carbon triple bond in a total of from
2 to 10, preferably from 2 to 6 and more preferably from 2 to 4, carbon atoms. Examples of such radicals include, but
are not limited to, ethynylene (-C≡C-), propynylene (-CH2-C≡C-) and the like.

[0047] The term "cycloalkyl," alone or in combination, refers to a saturated, cyclic arrangement of carbon atoms






form of a cycloalkenyl. opemao.enyi and the like. The term cycloalkenylene" refers to a diradical 

S gC '^^'^S^^I^^X "2 hyd H r ° 9en) — 6 9r0 ^ 
aromatic group setected from fre group^cS^^^ ^ ^ anthracen y': « * heterocyclic 

zo.yi. 2-pyrazo.inyl. pyrazolidiny.. LLy. 2i£oH ^ 3 oSo.7^ TS^^ Wd " 0 '* W ^ 
dazinyl. pyrimidinyl. pyrazinyl 1 3 5-triazinvl i vJw£ . -.^ • ' ' ' S^ 1320 '* 3. 4-thiadiazolyl. pyri- 
benzofb^ny, * S^hydrc^ MoE* 
quinolizinyl. quinolinyl. isoquinolinyl. cinnolinyl phthaJ^inVl aZLTr^Z yl> benzi, " ,daz benzth.azotyl. purinyt.4H- 
ca I bazo.y..acridiny.. P hena.iny..pLothiazinyV P a^^^ qu.noxa„ny,. t. 8-naphthyridiny,. pteridiny.. 

- » four subsets which are 

omethoxy. a.ky.. a.kenyl. alkynyl cTano. carS^ ^b^Kow ? ^ ^T^*^^ 
alkylamino. a.keny.aminc, alkynylamino. aliphSTor a™a£ acy °' Bi ^- 

pholinc^rDonylamino.thic.morphoHnocamony.amino iSSmiSn? Sj!f ^ ^ ^^"y'amino. mor- 
yurea; N-hydroxylurea; N-alkenylurea; N N-(a^k^ ^droxvnu^^Tat ara 'Marninosuffonyl; aralkoxyalkyl; N-aralkox- 
a.ky.)hydrazino ; Ar'-substituted sulfony heter^S^ th.oaryloxy-substituted aryt; N.N-(aryi. 

tuted heterocycly.; cycloalkyMused aryl; arytaxj -substiS atHeSr^M ' * ^fT^ ^'^eny.-substi- 
amonyl; aliphatic or aromatic acyl^ubstrtutea* altenyTS 2£JK£?!^^ 

aliphatic or aromatic acy.-subst.tuL ac^ydoa^^^^ ^'^^ «* 

phosphorodiamidyl acid or ester; y ^ ' cyc'oalkyf-substrtuted amino: aryloxycarbonyialkyt: 

alkynyl. IJMoxynwhylana. l.2^^„r'£^l^^^„ ^LT 8 "* 1 - """■"'■•"W alkyt. alkanyl. 

propoxy, n-butoxy, iso-butoxy, sec-butoxy, tert-butoxy and the like.

Syr ^SJSr^S Sl fc JKS ? T — — * term 

is? bu, r not ,imited to - a,,yioxy - e - - ^^To^^Ze sr*" * suitaue a,kenoxy radica,s 

ir^e. but are not limited to. propargyloxy. 2-^.^ anTthe liL EXamP,6S ° f SUitab ' e """"«* «*- 

a radical of formula a.ky«-NH- or a°kX-N-)^ 

alkylamino radicals include, but are not limH^ to TllT ^ J! 35 defined abova ^P 1 ^ <* suitable 
buty.amino.N.N^etrylaminoandt^ lite ^thylam.no. ethylamino, propylamino. iscpropylamnino. t- 

En me^rityrTs'a^ °< alkenyl-NH- or <a,keny,) 2 N, 

nylamino radicals is the ally.amino rascal P * *" ^ * TOt enamine - ****** of such alV 

stl ^r^^^: s^rsits r mu,a a, r - nh - ° r (a ~- 

nylaminoradicalsislh.pfcpargyl.mi^,^ ""«•■"**««« ynamma. Areaamd.ofsuitafty. 

San L^jins^i!rj2222ss ir it'- — ri s — - >— . 

.he, ,« m -^subsautad ania^ara £SXS£$ **" Whe " B " M 
t0061J Tha tarm -.rylamiao.- "W*oxy. cy-idylmy and Ihe like. 




. 3- and 4-pyridylamino and the like. 

[0062] The term "aryl-fused cycloalkyl," alone or in combination, refers to a cycloalkyl radical which shares two
adjacent atoms with an aryl radical, wherein the terms "cycloalkyl" and "aryl" are as defined above. An example of an
aryl-fused cycloalkyl radical is the benzofused cyclobutyl radical.

[0063] The term "alkylcarbonylamino," alone or in combination, refers to a radical of formula alkyl-CONH- wherein
the term "alkyl" is as defined above.

[0064] The term "alkoxycarbonylamino," alone or in combination, refers to a radical of formula alkyl-OCONH-
wherein the term "alkyl" is as defined above.

[0065] The term "alkylsulfonylamino," alone or in combination, refers to a radical of formula alkyl-SO2NH- wherein
the term "alkyl" is as defined above.

[0066] The term "arylsulfonylamino," alone or in combination, refers to a radical of formula aryl-SO2NH- wherein
the term "aryl" is as defined above.

[0067] The term "N-alkylurea," alone or in combination, refers to a radical of formula alkyl-NH-CO-NH- wherein the
term "alkyl" is as defined above.

[0068] The term "N-arylurea," alone or in combination, refers to a radical of formula aryl-NH-CO-NH- wherein the
term "aryl" is as defined above.

[0069] The term "halogen" means fluorine, chlorine, bromine and iodine.

[0070] The term "hydrocarbon radical" refers to an arrangement of carbon and hydrogen atoms which need only a
single hydrogen atom to be an independent stable molecule. Thus, a hydrocarbon radical has one open valence site on
a carbon atom, through which the hydrocarbon radical may be bonded to other atom(s). Alkyl, alkenyl, cycloalkyl, etc.
are examples of hydrocarbon radicals.
[0071] The term "hydrocarbon diradical" refers to an arrangement of carbon and hydrogen atoms which need two
hydrogen atoms in order to be an independent stable molecule. Thus, a hydrocarbon diradical has two open valence sites
on one or two carbon atoms, through which the hydrocarbon diradical may be bonded to other atom(s). Alkylene,
alkenylene, alkynylene, cycloalkylene, etc. are examples of hydrocarbon diradicals.

[0072] The term "hydrocarbyl" refers to any stable arrangement consisting entirely of carbon and hydrogen having
a single valence site to which it is bonded to another moiety, and thus includes radicals known as alkyl, alkenyl, alkynyl,
cycloalkyl, cycloalkenyl, aryl (without heteroatom incorporation into the aryl ring), arylalkyl, alkylaryl and the like.
Hydrocarbon radical is another name for hydrocarbyl.

[0073] The term "hydrocarbylene" refers to any stable arrangement consisting entirely of carbon and hydrogen
having two valence sites to which it is bonded to other moieties, and thus includes alkylene, alkenylene, alkynylene,
cycloalkylene, cycloalkenylene, arylene (without heteroatom incorporation into the arylene ring), arylalkylene,
alkylarylene and the like. Hydrocarbon diradical is another name for hydrocarbylene.

[0074] The term "hydrocarbyl-O-hydrocarbylene" refers to a hydrocarbyl group bonded to an oxygen atom, where
the oxygen atom is likewise bonded to a hydrocarbylene group at one of the two valence sites at which the
hydrocarbylene group is bonded to other moieties. The terms "hydrocarbyl-S-hydrocarbylene", "hydrocarbyl-NH-hydrocarbylene"
and "hydrocarbyl-amide-hydrocarbylene" have equivalent meanings, where oxygen has been replaced with sulfur, -NH-
or an amide group, respectively.

[0075] The term N-(hydrocarbyl)hydrocarbylene refers to a hydrocarbylene group wherein one of the two valence
sites is bonded to a nitrogen atom, and that nitrogen atom is simultaneously bonded to a hydrogen and a hydrocarbyl
group. The term N,N-di(hydrocarbyl)hydrocarbylene refers to a hydrocarbylene group wherein one of the two valence
sites is bonded to a nitrogen atom, and that nitrogen atom is simultaneously bonded to two hydrocarbyl groups.
[0076] The term "hydrocarbylacyl-hydrocarbylene" refers to a hydrocarbyl group bonded through an acyl (-C(=O)-)
group to one of the two valence sites of a hydrocarbylene group.

[0077] The terms "heterocyclylhydrocarbyl" and "heterocyclyl" refer to a stable, cyclic arrangement of atoms which
include carbon atoms and up to four atoms (referred to as heteroatoms) selected from oxygen, nitrogen, phosphorus
and sulfur. The cyclic arrangement may be in the form of a monocyclic ring of 3-7 atoms, or a bicyclic ring of 8-11 atoms.
The rings may be saturated or unsaturated (including aromatic rings), and may optionally be benzofused. Nitrogen and
sulfur atoms in the ring may be in any oxidized form, including the quaternized form of nitrogen. A
heterocyclylhydrocarbyl may be attached at any endocyclic carbon or heteroatom which results in the creation of a stable
structure. Preferred heterocyclylhydrocarbyls include 5-7 membered monocyclic heterocycles containing one or two
nitrogen heteroatoms.

[0078] A substituted heterocyclylhydrocarbyl refers to a heterocyclylhydrocarbyl as defined above, wherein at least
one ring atom thereof is bonded to an indicated substituent which extends off of the ring.

[0079] In referring to hydrocarbyl and hydrocarbylene groups, the term "derivatives of any of the foregoing wherein 
one or more hydrogens is replaced with an equal number of fluorides" refers to molecules that contain carbon hvdroaen 
and fluoride atoms, but no other atoms. ' 
[0080] The term "activated ester" refers to an ester that contains a "leaving group" which is readily displaceable by a
nucleophile, such as an amine, an alcohol or a thiol nucleophile. Such leaving groups are well known and include,
without limitation, N-hydroxysuccinimide, N-hydroxybenzotriazole, halogen (halides), alkoxy including
tetrafluorophenolates, thioalkoxy and the like. The term "protected ester" refers to an ester group that is masked or
otherwise unreactive. See, e.g., Greene, "Protecting Groups In Organic Synthesis."

[0081] In view of the above definitions, other chemical terms used throughout this application can be easily
understood by those of skill in the art. Terms may be used alone or in any combination thereof. The preferred and more
preferred chain lengths of the radicals apply to all such combinations.

A. GENERATION OF TAGGED NUCLEIC ACID FRAGMENTS 

[0082] As noted above, one aspect of the present invention provides a general scheme for DNA sequencing which
allows the use of more than 16 tags in each lane; with continuous detection, the tags can be detected and the sequence 
read as the size separation is occurring, just as with conventional fluorescence-based sequencing. This scheme is 
applicable to any of the DNA sequencing techniques based on size separation of tagged molecules. Suitable tags and 
linkers for use within the present invention, as well as methods for sequencing nucleic acids, are discussed in more 
detail below. 

1. Tags 

[0083] "Tag", as used herein, generally refers to a chemical moiety which is used to uniquely identify a "molecule
of interest", and more specifically refers to the tag variable component as well as whatever may be bonded most closely
to it in any of the tag reactant, tag component and tag moiety.

[0084] A tag which is useful in the present invention possesses several attributes: 

1) It is capable of being distinguished from all other tags. This discrimination from other chemical moieties can be
based on the chromatographic behavior of the tag (particularly after the cleavage reaction), its spectroscopic or
potentiometric properties, or some combination thereof. Spectroscopic methods by which tags are usefully
distinguished include mass spectroscopy (MS), infrared (IR), ultraviolet (UV), and fluorescence, where MS, IR and UV
are preferred, and MS is the most preferred spectroscopic method. Potentiometric amperometry is a preferred
potentiometric method.

2) The tag is capable of being detected when present at 10⁻²² to 10⁻⁶ mole.

3) The tag possesses a chemical handle through which it can be attached to the MOI which the tag is intended to
uniquely identify. The attachment may be made directly to the MOI, or indirectly through a "linker" group.

4) The tag is chemically stable toward all manipulations to which it is subjected, including attachment and cleavage
from the MOI, and any manipulations of the MOI while the tag is attached to it.

5) The tag does not significantly interfere with the manipulations performed on the MOI while the tag is attached to
it. For instance, if the tag is attached to an oligonucleotide, the tag must not significantly interfere with any
hybridization or enzymatic reactions (e.g., PCR sequencing reactions) performed on the oligonucleotide. Similarly, if the
tag is attached to an antibody, it must not significantly interfere with antigen recognition by the antibody.

[0085] A tag moiety which is intended to be detected by a certain spectroscopic or potentiometric method should
possess properties which enhance the sensitivity and specificity of detection by that method. Typically, the tag moiety
will have those properties because they have been designed into the tag variable component, which will typically
constitute the major portion of the tag moiety. In the following discussion, the word "tag" typically refers to the tag
moiety (i.e., the cleavage product that contains the tag variable component); however, it can also be considered to refer
to the tag variable component itself, because that is the portion of the tag moiety which is typically responsible for
providing the uniquely detectable properties. In compounds of the formula T-L-X, the "T" portion will contain the tag
variable component. Where the tag variable component has been designed to be characterized by, e.g., mass
spectrometry, the "T" portion of T-L-X may be referred to as T^ms. Likewise, the cleavage product from T-L-X that
contains T may be referred to as the T^ms-containing moiety. The following spectroscopic and potentiometric methods
may be used to characterize T^ms-containing moieties.

a. Characteristics of MS Tags 

[0086] Where a tag is analyzable by mass spectrometry (i.e., is an MS-readable tag, also referred to herein as an MS
tag or "T^ms-containing moiety"), the essential feature of the tag is that it is able to be ionized. It is thus a preferred
element in the design of MS-readable tags to incorporate therein a chemical functionality which can carry a positive or
negative charge under conditions of ionization in the MS. This feature confers improved efficiency of ion formation and
greater overall sensitivity of detection, particularly in electrospray ionization. The chemical functionality that supports an
ionized charge may derive from T^ms or L or both. Factors that can increase the relative sensitivity of an analyte being
detected by mass spectrometry are discussed in, e.g., Sunner, J., et al., Anal. Chem. 60:1300-1307 (1988).
[0087] A preferred functionality to facilitate the carrying of a negative charge is an organic acid, such as phenolic
hydroxyl, carboxylic acid, phosphonate, phosphate, tetrazole, sulfonyl urea, perfluoro alcohol and sulfonic acid.
[0088] Preferred functionalities to facilitate the carrying of a positive charge under ionization conditions are aliphatic
or aromatic amines. Examples of amine functional groups which give enhanced detectability of MS tags include
quaternary amines (i.e., amines that have four bonds, each to carbon atoms, see Aebersold, U.S. Patent No 5540 859) and
tertiary amines (i.e., amines that have three bonds, each to carbon atoms, which includes C=N-C groups such as are
present in pyridine, see Hess et al., Anal. Biochem. 224:373, 1995; Bures et al., Anal. Biochem. 224:364, 1995).
Hindered tertiary amines are particularly preferred. Tertiary and quaternary amines may be alkyl or aryl. A
T^ms-containing moiety must bear at least one ionizable species, but may possess more than one ionizable species. The
preferred charge state is a single ionized species per tag. Accordingly, it is preferred that each T^ms-containing moiety
(and each tag variable component) contain only a single hindered amine or organic acid group.

[0089] Suitable amine-containing radicals that may form part of the T^ms-containing moiety include the following:



[Structure drawings omitted: amine-containing radicals, including amide-linked groups of the general type
-C(=O)NH-(C2-C10)-N(C1-C10)2 and related cyclic amines.]



[0090] The identification of a tag by mass spectrometry is preferably based upon its molecular mass to charge ratio
(m/z). The preferred molecular mass range of MS tags is from about 100 to 2,000 daltons, and preferably the
T^ms-containing moiety has a mass of at least about 250 daltons, more preferably at least about 300 daltons, and still
more preferably at least about 350 daltons. It is generally difficult for mass spectrometers to distinguish among moieties
having parent ions below about 200-250 daltons (depending on the precise instrument), and thus preferred
T^ms-containing moieties of the invention have masses above that range.

[0091] As explained above, the T^ms-containing moiety may contain atoms other than those present in the tag
variable component, and indeed other than those present in T^ms itself. Accordingly, the mass of T^ms itself may be less
than about 250 daltons, so long as the T^ms-containing moiety has a mass of at least about 250 daltons. Thus, the mass
of T^ms may range from 15 (i.e., a methyl radical) to about 10,000 daltons, and preferably ranges from 100 to about
5,000 daltons, and more preferably ranges from about 200 to about 1,000 daltons.

[0092] It is relatively difficult to distinguish tags by mass spectrometry when those tags incorporate atoms that have
more than one isotope in significant abundance. Accordingly, preferred T groups which are intended for mass
spectroscopic identification (T^ms groups) contain carbon, at least one of hydrogen and fluoride, and optional atoms
selected from oxygen, nitrogen, sulfur, phosphorus and iodine. While other atoms may be present in the T^ms, their
presence can render analysis of the mass spectral data somewhat more difficult. Preferably, the T^ms groups have only
carbon, nitrogen and oxygen atoms, in addition to hydrogen and/or fluoride.

[0093] Fluoride is an optional yet preferred atom to have in a T^ms group. In comparison to hydrogen, fluoride is, of
course, much heavier. Thus, the presence of fluoride atoms rather than hydrogen atoms leads to T^ms groups of higher
mass, thereby allowing the T^ms group to reach and exceed a mass of 250 daltons, which is desirable as
explained above. In addition, the replacement of hydrogen with fluoride confers greater volatility on the T^ms-containing
moiety, and greater volatility of the analyte enhances sensitivity when mass spectrometry is being used as the detection
method.
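
The mass effect of replacing hydrogen with fluoride can be illustrated with a short calculation. The sketch below is
not part of the disclosed method; it assumes standard average atomic masses (about 1.008 for hydrogen and 18.998 for
fluorine), and the function names and the 180 dalton starting mass are hypothetical.

```python
# Illustrative sketch only. Atomic masses are standard average values (assumed),
# in daltons; they do not come from the text.
MASS_H = 1.008
MASS_F = 18.998


def mass_after_fluorination(parent_mass: float, n_substituted: int) -> float:
    """Mass of a group after replacing n_substituted hydrogens with fluorine."""
    return parent_mass + n_substituted * (MASS_F - MASS_H)


def substitutions_to_exceed(parent_mass: float, target_mass: float = 250.0) -> int:
    """Smallest number of H-to-F replacements that lifts the mass above target_mass."""
    n = 0
    while mass_after_fluorination(parent_mass, n) <= target_mass:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical 180 Da group: each replacement adds roughly 17.99 Da.
    n = substitutions_to_exceed(180.0)
    print(n, round(mass_after_fluorination(180.0, n), 2))
```

Each replacement adds roughly 18 daltons, so a handful of substitutions is enough to move a light group past the
250 dalton threshold discussed above.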

[0094] The molecular formula of T^ms falls within the scope of
C_{1-500}N_{0-100}O_{0-100}S_{0-10}P_{0-10}H_{α}F_{β}I_{δ}, wherein the sum of α, β and δ is sufficient to satisfy the
otherwise unsatisfied valencies of the C, N, O, S and P atoms. The designation
C_{1-500}N_{0-100}O_{0-100}S_{0-10}P_{0-10}H_{α}F_{β}I_{δ} means that T^ms contains at least one, and may contain any
number from 1 to 500, carbon atoms, in addition to optionally containing as many as 100 nitrogen atoms ("N_{0-}" means
that T^ms need not contain any nitrogen atoms), and as many as 100 oxygen atoms, and as many as 10 sulfur atoms and
as many as 10 phosphorus atoms. The symbols α, β and δ represent the number of hydrogen, fluoride and iodide atoms in
T^ms, where any two of these numbers may be zero, and where the sum of these numbers equals the total of the
otherwise unsatisfied valencies of the C, N, O, S and P atoms. Preferably, T^ms has a molecular formula that falls
within the scope of C_{1-50}N_{0-10}O_{0-10}H_{α}F_{β}, where α and β are the numbers of hydrogen and fluoride atoms,
respectively, present in the moiety.
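
As a rough illustration of the element-count limits in the general formula, the following sketch checks whether a
composition lies within the stated ranges. It is a simplification under stated assumptions: the valence condition on
the hydrogen, fluoride and iodide counts is not checked, and the names `LIMITS`, `ALLOWED` and `within_scope` are
hypothetical.

```python
# Illustrative sketch only. Count limits follow the general formula in the text:
# C 1-500, N 0-100, O 0-100, S 0-10, P 0-10. The valence condition on the
# H, F and I counts is NOT checked here (simplifying assumption).
LIMITS = {"C": (1, 500), "N": (0, 100), "O": (0, 100), "S": (0, 10), "P": (0, 10)}
ALLOWED = set(LIMITS) | {"H", "F", "I"}


def within_scope(counts: dict) -> bool:
    """Return True if the element counts fall inside the general formula's ranges."""
    if any(element not in ALLOWED for element in counts):
        return False
    for element, (low, high) in LIMITS.items():
        if not low <= counts.get(element, 0) <= high:
            return False
    return True


if __name__ == "__main__":
    print(within_scope({"C": 12, "H": 10, "N": 1, "O": 2}))  # inside the ranges
    print(within_scope({"N": 1, "H": 3}))                    # no carbon atom
```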



b. Characteristics of IR Tags 



[0095] There are two primary forms of IR detection of organic chemical groups: Raman scattering IR and absorption
IR. Raman scattering IR spectra and absorption IR spectra are complementary spectroscopic methods. In general,
Raman excitation depends on bond polarizability changes whereas IR absorption depends on bond dipole moment
changes. Weak IR absorption lines become strong Raman lines and vice versa. Wavenumber is the characteristic unit
for IR spectra. There are 3 spectral regions for IR tags which have separate applications: near IR at 12500 to 4000 cm⁻¹,
mid IR at 4000 to 600 cm⁻¹, and far IR at 600 to 30 cm⁻¹. For the uses described herein where a compound is to serve
as a tag to identify an MOI, probe or primer, the mid spectral regions would be preferred. For example, the carbonyl
stretch (1850 to 1750 cm⁻¹) would be measured for carboxylic acids, carboxylic esters and amides, and alkyl and aryl
carbonates, carbamates and ketones. N-H bending (1750 to 160 cm⁻¹) would be used to identify amines, ammonium
ions, and amides. At 1400 to 1250 cm⁻¹, R-OH bending is detected as well as the C-N stretch in amides. Aromatic
substitution patterns are detected at 900 to 690 cm⁻¹ (C-H bending, N-H bending for ArNH2). Saturated C-H, olefins,
aromatic rings, double and triple bonds, esters, acetals, ketals, ammonium salts, N-O compounds such as oximes, nitro,
N-oxides, and nitrates, azo, hydrazones, quinones, carboxylic acids, amides, and lactams all possess vibrational
infrared correlation data (see Pretsch et al., Spectral Data for Structure Determination of Organic Compounds,
Springer-Verlag, New York, 1989). Preferred compounds would include an aromatic nitrile, which exhibits a very strong
nitrile stretching vibration at 2230 to 2210 cm⁻¹. Other useful types of compounds are aromatic alkynes, which have a
strong stretching vibration that gives rise to a sharp absorption band between 2140 and 2100 cm⁻¹. A third compound
type is the aromatic azides, which exhibit an intense absorption band in the 2160 to 2120 cm⁻¹ region. Thiocyanates are
representative of compounds that have a strong absorption at 2275 to 2263 cm⁻¹.
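
The spectral regions and characteristic bands listed above can be collected into a small lookup, shown here only as
a sketch. The band limits are those quoted in the text; the function names and the handling of boundary values (a
value on a shared boundary is assigned to the first matching region) are assumptions made for illustration.

```python
# Illustrative sketch only. Region and band limits (in cm^-1) are the values
# quoted in the text; names and boundary handling are assumptions.
REGIONS = [("near IR", 4000, 12500), ("mid IR", 600, 4000), ("far IR", 30, 600)]

TAG_BANDS = {
    "aromatic nitrile": (2210, 2230),
    "aromatic alkyne": (2100, 2140),
    "aromatic azide": (2120, 2160),
    "thiocyanate": (2263, 2275),
}


def ir_region(wavenumber: float):
    """Name of the first spectral region containing the wavenumber, or None."""
    for name, low, high in REGIONS:
        if low <= wavenumber <= high:
            return name
    return None


def candidate_tag_groups(wavenumber: float) -> list:
    """Tag group types whose quoted absorption band contains the wavenumber."""
    return [name for name, (low, high) in TAG_BANDS.items() if low <= wavenumber <= high]


if __name__ == "__main__":
    print(ir_region(2220), candidate_tag_groups(2220))
    print(ir_region(2130), candidate_tag_groups(2130))
```

Note that the quoted alkyne and azide bands overlap between 2120 and 2140 cm⁻¹, so a single absorption in that window
would not by itself distinguish the two tag types.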




c. Characteristics of UV Tags 

[0096] A compilation of organic chromophore types and their respective UV-visible properties is given in Scott
(Interpretation of the UV Spectra of Natural Products, Pergamon Press, New York, 1962). A chromophore is an atom
or group of atoms or electrons that are responsible for the particular light absorption. Empirical rules exist for the π to
π* maxima in conjugated systems (see Pretsch et al., Spectral Data for Structure Determination of Organic
Compounds, p. B65 and B70, Springer-Verlag, New York, 1989). Preferred compounds (with conjugated systems)
would possess n to π* and π to π* transitions. Such compounds are exemplified by Acid Violet 7, Acridine Orange,
Acridine Yellow G, Brilliant Blue G, Congo Red, Crystal Violet, Malachite Green oxalate, Metanil Yellow, Methylene Blue,
Methyl Orange, Methyl Violet B, Naphtol Green B, Oil Blue N, Oil Red O, 4-phenylazophenol, Safranine O, Solvent Green
3, and Sudan Orange G, all of which are commercially available (Aldrich, Milwaukee, WI). Other suitable compounds
are listed in, e.g., Jane, I., et al., J. Chrom. 323:191-225 (1985).

d. Characteristic of a Fluorescent Tag 

[0097] Fluorescent probes are identified and quantitated most directly by their absorption and fluorescence emission
wavelengths and intensities. Emission spectra (fluorescence and phosphorescence) are much more sensitive and
permit more specific measurements than absorption spectra. Other photophysical characteristics such as excited-state
lifetime and fluorescence anisotropy are less widely used. The most generally useful intensity parameters are the molar
extinction coefficient (ε) for absorption and the quantum yield (QY) for fluorescence. The value of ε is specified at a
single wavelength (usually the absorption maximum of the probe), whereas QY is a measure of the total photon emission
over the entire fluorescence spectral profile. A narrow optical bandwidth (<20 nm) is usually used for fluorescence
excitation (via absorption), whereas the fluorescence detection bandwidth is much more variable, ranging from full
spectrum for maximal sensitivity to narrow band (~20 nm) for maximal resolution. Fluorescence intensity per probe
molecule is proportional to the product of ε and QY. The range of these parameters among fluorophores of current
practical importance is approximately 10,000 to 100,000 cm⁻¹M⁻¹ for ε and 0.1 to 1.0 for QY. Compounds that can serve
as fluorescent tags are as follows: fluorescein, rhodamine, lambda blue 470, lambda green, lambda red 664, lambda red
665, acridine orange, and propidium iodide, which are commercially available from Lambda Fluorescence Co. (Pleasant
Gap, PA). Fluorescent compounds such as nile red, Texas Red, lissamine™ and BODIPY™ dyes are available from
Molecular Probes (Eugene, OR).
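
Because intensity per probe molecule is proportional to the product of ε and QY, candidate fluorophores can be
compared on a relative scale. The sketch below only illustrates that proportionality; the function name is
hypothetical, and the two inputs are simply the ends of the ranges quoted above rather than measured values for any
particular dye.

```python
# Illustrative sketch only: relative per-molecule brightness as the product of
# the molar extinction coefficient (cm^-1 M^-1) and the quantum yield.
def relative_brightness(extinction_coefficient: float, quantum_yield: float) -> float:
    """Return epsilon * QY, a quantity proportional to intensity per probe molecule."""
    if not 0.0 <= quantum_yield <= 1.0:
        raise ValueError("quantum yield must lie between 0 and 1")
    return extinction_coefficient * quantum_yield


if __name__ == "__main__":
    dimmest = relative_brightness(10_000, 0.1)     # low end of both quoted ranges
    brightest = relative_brightness(100_000, 1.0)  # high end of both quoted ranges
    print(round(brightest / dimmest))  # spread across the quoted ranges
```

Across the quoted ranges the product spans about two orders of magnitude, which suggests why both parameters matter
when choosing among fluorescent tags.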

e. Characteristics of Potentiometric Tags 

[0098] The principle of electrochemical detection (ECD) is based on the oxidation or reduction of compounds: at
certain applied voltages, electrons are either donated or accepted, producing a current which can be measured.
When certain compounds are subjected to a potential difference, the molecules undergo a molecular rearrangement at
the working electrode's surface with the loss (oxidation) or gain (reduction) of electrons; such compounds are said to
be electroactive and undergo electrochemical reactions. EC detectors apply a voltage at an electrode surface over which
the HPLC eluent flows. Electroactive compounds eluting from the column either donate electrons (oxidize) or acquire
electrons (reduce), generating a current peak in real time. Importantly, the amount of current generated depends on both
the concentration of the analyte and the voltage applied, with each compound having a specific voltage at which it
begins to oxidize or reduce. The currently most popular electrochemical detector is the amperometric detector, in which
the potential is kept constant and the current produced from the electrochemical reaction is then measured. This type
of detection is currently called "potentiostatic amperometry". Commercial amperometers are available from ESA
Inc., Chelmsford, MA.

[0099] When the efficiency of detection is 100%, the specialized detectors are termed "coulometric". Coulometric
detectors are sensitive and have a number of practical advantages with regard to selectivity and sensitivity which
make these types of detectors useful in an array. In coulometric detectors, for a given concentration of analyte, the
signal current is plotted as a function of the applied potential (voltage) to the working electrode. The resultant sigmoidal
graph is called the current-voltage curve or hydrodynamic voltammogram (HDV). The HDV allows the best choice of
applied potential to the working electrode that permits one to maximize the observed signal. A major advantage of ECD
is its inherent sensitivity, with current levels of detection in the subfemtomole range.

[0100] Numerous chemicals and compounds are electrochemically active, including many biochemicals,
pharmaceuticals and pesticides. Chromatographically coeluting compounds can be effectively resolved even if their
half-wave potentials (the potential at half signal maximum) differ by only 30-60 mV.

[0101] Recently developed coulometric sensors provide selectivity, identification and resolution of co-eluting
compounds when used as detectors in liquid chromatography based separations. Therefore, these arrayed detectors add
another set of separations accomplished in the detector itself. Current instruments possess 16 channels which are in
principle limited only by the rate at which data can be acquired. The number of compounds which can be resolved on
the EC array is chromatographically limited (i.e., plate count limited). However, if two or more compounds that
chromatographically co-elute have a difference in half-wave potentials of 30-60 mV, the array is able to distinguish the
compounds. The ability of a compound to be electrochemically active relies on the possession of an EC active group
(i.e., -OH, -O, -N, -S).
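
The resolution criterion for co-eluting compounds can be expressed as a simple comparison of half-wave potentials.
The following is a minimal sketch, assuming a fixed minimum separation: the default of 30 mV is the lower end of the
30-60 mV figure in the text, the actual requirement would depend on the instrument, and the function name and example
potentials are hypothetical.

```python
# Illustrative sketch only. The 30 mV default is the lower end of the 30-60 mV
# range quoted in the text; a real instrument may need a larger separation.
def distinguishable_on_array(half_wave_a_mv: float, half_wave_b_mv: float,
                             min_separation_mv: float = 30.0) -> bool:
    """True if two co-eluting compounds differ enough in half-wave potential."""
    return abs(half_wave_a_mv - half_wave_b_mv) >= min_separation_mv


if __name__ == "__main__":
    # Hypothetical half-wave potentials in millivolts.
    print(distinguishable_on_array(450.0, 500.0))        # 50 mV apart
    print(distinguishable_on_array(450.0, 460.0))        # 10 mV apart
    print(distinguishable_on_array(450.0, 500.0, 60.0))  # stricter threshold
```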

[0102] Compounds which have been successfully detected using coulometric detectors include 5-hydroxytryptamine,
3-methoxy-4-hydroxyphenyl-glycol, homogentisic acid, dopamine, metanephrine, 3-hydroxykynurenine, acetaminophen,
3-hydroxytryptophol, 5-hydroxyindoleacetic acid, octanesulfonic acid, phenol, o-cresol, pyrogallol, 2-nitrophenol,
4-nitrophenol, 2,4-dinitrophenol, 4,6-dinitrocresol, 3-methyl-2-nitrophenol, 2,4-dichlorophenol, 2,6-dichlorophenol,
2,4,5-trichlorophenol, 4-chloro-3-methylphenol, 5-methylphenol, 4-methyl-2-nitrophenol, 2-hydroxyaniline,
4-hydroxyaniline, 1,2-phenylenediamine, benzocatechin, buturon, chlortoluron, diuron, isoproturon, linuron,
methobromuron, metoxuron, monolinuron, monuron, methionine, tryptophan, tyrosine, 4-aminobenzoic acid,
4-hydroxybenzoic acid, 4-hydroxycoumaric acid, 7-methoxycoumarin, apigenin, baicalein, caffeic acid, catechin,
centaurein, chlorogenic acid, daidzein, datiscetin, diosmetin, epicatechin gallate, epigallo catechin, epigallo catechin
gallate, eugenol, eupatorin, ferulic acid, fisetin, galangin, gallic acid, gardenin, genistein, gentisic acid, hesperidin,
irigenin, kaempferol, leucocyanidin, luteolin, mangostin, morin, myricetin, naringin, narirutin, pelargonidin, peonidin,
phloretin, pratensein, protocatechuic acid, rhamnetin, quercetin, sakuranetin, scutellarein, scopoletin, syringaldehyde,
syringic acid, tangeritin, troxerutin, umbelliferone, vanillic acid, 1,3-dimethyl tetrahydroisoquinoline,
6-hydroxydopamine, r-salsolinol, N-methyl-r-salsolinol, tetrahydroisoquinoline, amitriptyline, apomorphine, capsaicin,
chlordiazepoxide, chlorpromazine, daunorubicin, desipramine, doxepin, fluoxetine, flurazepam, imipramine,
isoproterenol, methoxamine, morphine, morphine-3-glucuronide, nortriptyline, oxazepam, phenylephrine, trimipramine,
ascorbic acid, N-acetyl serotonin, 3,4-dihydroxybenzylamine, 3,4-dihydroxymandelic acid (DOMA),
3,4-dihydroxyphenylacetic acid (DOPAC), 3,4-dihydroxyphenylalanine (L-DOPA), 3,4-dihydroxyphenylglycol (DHPG),
3-hydroxyanthranilic acid, 2-hydroxyphenylacetic acid (2HPAC), 4-hydroxybenzoic acid (4HBAC),
5-hydroxyindole-3-acetic acid (5HIAA), 3-hydroxykynurenine, 3-hydroxymandelic acid,
3-hydroxy-4-methoxyphenylethylamine, 4-hydroxyphenylacetic acid (4HPAC), 4-hydroxyphenyllactic acid (4HPLA),
5-hydroxytryptophan (5HTP), 5-hydroxytryptophol (5HTOL), 5-hydroxytryptamine (5HT), 5-hydroxytryptamine sulfate,
3-methoxy-4-hydroxyphenylglycol (MHPG), 5-methoxytryptamine, 5-methoxytryptophan, 5-methoxytryptophol,
3-methoxytyramine (3MT), 3-methoxytyrosine (3-OM-DOPA), 5-methylcysteine, 3-methylguanine, bufotenin, dopamine,
dopamine-3-glucuronide, dopamine-3-sulfate, dopamine-4-sulfate, epinephrine, epinine, folic acid, glutathione
(reduced), guanine, guanosine, homogentisic acid (HGA), homovanillic acid (HVA), homovanillyl alcohol (HVOL),
homoveratic acid, hva sulfate, hypoxanthine, indole, indole-3-acetic acid, indole-3-lactic acid, kynurenine, melatonin,
metanephrine, N-methyltryptamine, N-methyltyramine, N,N-dimethyltryptamine, N,N-dimethyltyramine, norepinephrine,
normetanephrine, octopamine, pyridoxal, pyridoxal phosphate, pyridoxamine, synephrine, tryptophol, tryptamine,
tyramine, uric acid, vanillylmandelic acid (vma), xanthine and xanthosine. Other suitable compounds are set forth in,
e.g., Jane, I., et al., J. Chrom. 323:191-225 (1985) and Musch, G., et al., J. Chrom. 348:97-110 (1985). These
compounds can be incorporated into compounds of formula T-L-X by methods known in the art. For example,
compounds having a carboxylic acid group may be reacted with amine, hydroxyl, etc. to form amide, ester and other
linkages between T and L.

[0103] In addition to the above properties, and regardless of the intended detection method, it is preferred that the
tag have a modular chemical structure. This aids in the construction of large numbers of structurally related tags using
the techniques of combinatorial chemistry. For example, the T^ms group desirably has several properties. It desirably
contains a functional group which supports a single ionized charge state when the T^ms-containing moiety is subjected
to mass spectrometry (more simply referred to as a "mass spec sensitivity enhancer" group, or MSSE). Also, it desirably
can serve as one member in a family of T^ms-containing moieties, where members of the family each have a different
mass/charge ratio, yet have approximately the same sensitivity in the mass spectrometer. Thus, the members of
the family desirably have the same MSSE. In order to allow the creation of families of compounds, it has been found
convenient to generate tag reactants via a modular synthesis scheme, so that the tag components themselves may be
viewed as comprising modules.

[0104] In a preferred modular approach to the structure of the T^ms group, T^ms has the formula

T^2-(J-T^3-)_n-

wherein T^2 is an organic moiety formed from carbon and one or more of hydrogen, fluoride, iodide, oxygen, nitrogen,
sulfur and phosphorus, having a mass range of 15 to 500 daltons; T^3 is an organic moiety formed from carbon and one
or more of hydrogen, fluoride, iodide, oxygen, nitrogen, sulfur and phosphorus, having a mass range of 50 to 1000
daltons; J is a direct bond or a functional group such as amide, ester, amine, sulfide, ether, thioester, disulfide, thioether,
urea, thiourea, carbamate, thiocarbamate, Schiff base, reduced Schiff base, imine, oxime, hydrazone, phosphate,
phosphonate, phosphoramide, phosphonamide, sulfonate, sulfonamide or carbon-carbon bond; and n is an integer
ranging from 1 to 50, such that when n is greater than 1, each T^3 and J is independently selected.
[0105] The modular structure T^2-(J-T^3-)_n- provides a convenient entry to families of T-L-X compounds where each
member of the family has a different T group. For instance, when T is T^ms, and each family member desirably has the
same MSSE, one of the T^3 groups can provide that MSSE structure. In order to provide variability between members
of a family in terms of the mass of T^ms, the T^2 group may be varied among family members. For instance, one family
member may have T^2 = methyl, while another has T^2 = ethyl, and another has T^2 = propyl, etc.
[0106] In order to provide "gross" or large jumps in mass, a T^3 group may be designed which adds a significant
number (e.g., one or several hundred) of mass units to T-L-X. Such a T^3 group may be referred to as a molecular weight
range adjuster group ("WRA"). A WRA is quite useful if one is working with a single set of T^2 groups, which will have
masses extending over a limited range. A single set of T^2 groups may be used to create T^ms groups having a wide range
of mass simply by incorporating one or more WRA T^3 groups into the T^ms. Thus, using a simple example, if a set of T^2
groups affords a mass range of 250-340 daltons for the T^ms, the addition of a single WRA having, as an exemplary
number, 100 daltons, as a T^3 group provides access to the mass range of 350-440 daltons while using the same set of
T^2 groups. Similarly, the addition of two 100 dalton WRA groups (each as a T^3 group) provides access to the mass
range of 450-540 daltons, where this incremental addition of WRA groups can be continued to provide access to a very
large mass range for the T^ms group. Preferred compounds of the formula T^2-(J-T^3-)_n-L-X have the formula
R_VWC-(R_WRA)_w-R_MSSE-L-X, where VWC is a "T^2" group, and each of the WRA and MSSE groups are "T^3" groups.
This structure is illustrated in Figure 13, and represents one modular approach to the preparation of T^ms.
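
The arithmetic of the WRA example can be sketched in a few lines. The code below reproduces only the worked numbers
given in the text (a 250-340 dalton base range and a 100 dalton WRA); the function name is hypothetical, and the
sketch assumes that every WRA group in a tag has the same mass.

```python
# Illustrative sketch only: shift a base mass range by a number of identical
# WRA groups. Assumes all WRA groups have the same mass.
def mass_range_with_wra(base_range: tuple, wra_mass: float, n_wra: int) -> tuple:
    """Return the (low, high) mass range after adding n_wra WRA groups."""
    low, high = base_range
    shift = wra_mass * n_wra
    return (low + shift, high + shift)


if __name__ == "__main__":
    base = (250, 340)  # range afforded by one set of T^2 groups (from the text)
    for n in range(3):
        print(n, mass_range_with_wra(base, 100, n))
```

With zero, one and two WRA groups, the same set of T^2 groups covers 250-340, 350-440 and 450-540 daltons, so the
ranges tile without overlap when the WRA mass exceeds the width of the base range.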

[0107] In the formula T^2-(J-T^3-)_n-, T^2 and T^3 are preferably selected from hydrocarbyl,
hydrocarbyl-O-hydrocarbylene, hydrocarbyl-S-hydrocarbylene, hydrocarbyl-NH-hydrocarbylene,
hydrocarbyl-amide-hydrocarbylene, N-(hydrocarbyl)hydrocarbylene, N,N-di(hydrocarbyl)hydrocarbylene,
hydrocarbylacyl-hydrocarbylene, heterocyclylhydrocarbyl wherein the heteroatom(s) are selected from oxygen, nitrogen,
sulfur and phosphorus, substituted heterocyclylhydrocarbyl wherein the heteroatom(s) are selected from oxygen,
nitrogen, sulfur and phosphorus and the substituents are selected from hydrocarbyl, hydrocarbyl-O-hydrocarbylene,
hydrocarbyl-NH-hydrocarbylene, hydrocarbyl-S-hydrocarbylene, N-(hydrocarbyl)hydrocarbylene,
N,N-di(hydrocarbyl)hydrocarbylene and hydrocarbylacyl-hydrocarbylene. In addition, T^2 and/or T^3 may be a
derivative of any of the previously listed potential T^2/T^3 groups, such that one or more hydrogens are replaced with
fluorides.
[0108] Also regarding the formula T^2-(J-T^3-)_n-, a preferred T^3 has the formula -G(R^2)-, wherein G is a C_{1-6}
alkylene chain having a single R^2 substituent. Thus, if G is ethylene (-CH2-CH2-), either one of the two ethylene carbons
may have an R^2 substituent, and R^2 is selected from alkyl, alkenyl, alkynyl, cycloalkyl, aryl-fused cycloalkyl,
cycloalkenyl, aryl, aralkyl, aryl-substituted alkenyl or alkynyl, cycloalkyl-substituted alkyl, cycloalkenyl-substituted
cycloalkyl, biaryl, alkoxy, alkenoxy, alkynoxy, aralkoxy, aryl-substituted alkenoxy or alkynoxy, alkylamino,
alkenylamino or alkynylamino, aryl-substituted alkylamino, aryl-substituted alkenylamino or alkynylamino, aryloxy,
arylamino, N-alkylurea-substituted alkyl, N-arylurea-substituted alkyl, alkylcarbonylamino-substituted alkyl,
aminocarbonyl-substituted alkyl, heterocyclyl, heterocyclyl-substituted alkyl, heterocyclyl-substituted amino,
carboxyalkyl substituted aralkyl, oxocarbocyclyl-fused aryl and heterocyclylalkyl; cycloalkenyl, aryl-substituted alkyl
and aralkyl, hydroxy-substituted alkyl, alkoxy-substituted alkyl, aralkoxy-substituted alkyl, alkoxy-substituted alkyl,
aralkoxy-substituted alkyl, amino-substituted alkyl, (aryl-substituted alkyloxycarbonylamino)-substituted alkyl,
thiol-substituted alkyl, alkylsulfonyl-substituted alkyl, (hydroxy-substituted alkylthio)-substituted alkyl,
thioalkoxy-substituted alkyl, hydrocarbylacylamino-substituted alkyl, heterocyclylacylamino-substituted alkyl,
hydrocarbyl-substituted-heterocyclylacylamino-substituted alkyl, alkylsulfonylamino-substituted alkyl,
arylsulfonylamino-substituted alkyl, morpholino-alkyl, thiomorpholino-alkyl, morpholino carbonyl-substituted alkyl,
thiomorpholinocarbonyl-substituted alkyl, [N-(alkyl, alkenyl or alkynyl)- or N,N-[dialkyl, dialkenyl, dialkynyl or (alkyl,
alkenyl)-amino]carbonyl-substituted alkyl, heterocyclylaminocarbonyl, heterocyclylalkyleneaminocarbonyl,
heterocyclylaminocarbonyl-substituted alkyl, heterocyclylalkyleneaminocarbonyl-substituted alkyl,
N,N-[dialkyl]alkyleneaminocarbonyl, N,N-[dialkyl]alkyleneaminocarbonyl-substituted alkyl, alkyl-substituted
heterocyclylcarbonyl, alkyl-substituted heterocyclylcarbonyl-alkyl, carboxyl-substituted alkyl, dialkylamino-substituted
acylaminoalkyl and amino acid side chains selected from arginine, asparagine, glutamine, S-methyl cysteine, methionine
and corresponding sulfoxide and sulfone derivatives thereof, glycine, leucine, isoleucine, allo-isoleucine, tert-leucine,
norleucine, phenylalanine, tyrosine, tryptophan, proline, alanine, ornithine, histidine, glutamine, valine, threonine,
serine, aspartic acid, beta-cyanoalanine and allothreonine; alkynyl and heterocyclylcarbonyl, aminocarbonyl, amido,
mono- or dialkylaminocarbonyl, mono- or diarylaminocarbonyl, alkylarylaminocarbonyl, diarylaminocarbonyl, mono- or
diacylaminocarbonyl, aromatic or aliphatic acyl, alkyl optionally substituted by substituents selected from amino,
carboxy, hydroxy, mercapto, mono- or dialkylamino, mono- or diarylamino, alkylarylamino, diarylamino, mono- or
diacylamino, alkoxy, alkenoxy, aryloxy, thioalkoxy, thioalkenoxy, thioalkynoxy, thioaryloxy and heterocyclyl.

[0109] A preferred compound of the formula T^2-(J-T^3-)_n-L-X has the structure:

[Structure drawing omitted: T^2 joined through an amide to "G", which bears a -(CH2)c-Amide-T^4 substituent, followed
by L and X.]



wherein G is (CH2)1-6 such that a hydrogen on one and only one of the CH2 groups represented by a single "G" is
replaced with -(CH2)c-Amide-T4; T2 and T4 are organic moieties of the formula C1-25N0-9O0-9HαFβ such that the sum of
α and β is sufficient to satisfy the otherwise unsatisfied valencies of the C, N, and O atoms; Amide is

-N(R1)-C(=O)- or -C(=O)-N(R1)-;

R1 is hydrogen or C1-10 alkyl; c is an integer ranging from 0 to 4; and n is an integer ranging from 1 to 50 such that when
n is greater than 1, G, c, Amide, R1 and T4 are independently selected.
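
The element ranges stated for the T2 and T4 moieties can be checked mechanically. The following is a minimal sketch, assuming a simple condensed formula string as input; the function names are hypothetical, and the sketch tests only the C, N and O count ranges and the permitted elements, not whether the hydrogen and fluorine counts actually satisfy the remaining valencies.

```python
import re

# Count ranges taken from the formula C1-25 N0-9 O0-9 H(alpha) F(beta).
LIMITS = {"C": (1, 25), "N": (0, 9), "O": (0, 9)}
ALLOWED = {"C", "N", "O", "H", "F"}


def parse_formula(formula):
    """Turn a condensed formula such as 'C7H5O' into {'C': 7, 'H': 5, 'O': 1}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def within_tag_limits(formula):
    """True if only C, N, O, H, F occur and the C, N, O counts are in range."""
    counts = parse_formula(formula)
    if any(element not in ALLOWED for element in counts):
        return False
    return all(low <= counts.get(element, 0) <= high
               for element, (low, high) in LIMITS.items())
```

For example, a benzoyl fragment written as C7H5O passes, whereas a fragment with 30 carbons or one containing chlorine does not.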

[0110] In a further preferred embodiment, a compound of the formula T2-(J-T3-)n-L-X has the structure:

[Structure not reproduced in this text; the surviving labels are T4, Amide, R1 and (CH2)c.]

wherein T5 is an organic moiety of the formula C1-25N0-9O0-9HαFβ such that the sum of α and β is sufficient to satisfy
the otherwise unsatisfied valencies of the C, N, and O atoms; and T5 includes a tertiary or quaternary amine or an
organic acid; m is an integer ranging from 0-49, and T2, T4, R1, L and X have been previously defined.
[0111] Another preferred compound having the formula T2-(J-T3-)n-L-X has the particular structure:

[Structure not reproduced in this text.]

wherein T5 is an organic moiety of the formula C1-25N0-9O0-9HαFβ such that the sum of α and β is sufficient to satisfy
the otherwise unsatisfied valencies of the C, N, and O atoms; and T5 includes a tertiary or quaternary amine or an
organic acid; m is an integer ranging from 0-49, and T2, T4, c, R1, "Amide", L and X have been previously defined.
[0112] In the above structures that have a T5 group, -Amide-T5 is preferably one of the following, which are
conveniently made by reacting organic acids with free amino groups extending from "G":

[Structures not reproduced in this text. The surviving fragments show -NHC(=O)- amides joined, in some cases through
(C0-C10) or (C1-C10) chains, to nitrogen-containing groups.]



[0113] Where the above compounds have a T5 group, and the "G" group has a free carboxyl group (or reactive
equivalent thereof), then the following are preferred -Amide-T5 groups, which may conveniently be prepared by reacting
the appropriate organic amine with a free carboxyl group extending from a "G" group:

[Structures not reproduced in this text. The surviving fragments show -C(=O)NH- amides joined through (C1-C10) or
(C2-C10) chains to nitrogen-containing groups, one of which terminates in -N(C1-C10)2.]




[0114] In three preferred embodiments of the invention, T-L-MOI has the structure:

[Structure not reproduced in this text; the surviving labels are (CH2)c, R1, NO2 and (C1-C10)-ODN-3'-OH.]

or the structure:

[Structure not reproduced in this text; the surviving labels are T4, Amide, (CH2)c, T2, NO2 and (C1-C10)-ODN-3'-OH.]

or the structure:

[Structure not reproduced in this text; the surviving label is (C1-C10)-ODN-3'-OH.]



wherein T2 and T4 are organic moieties of the formula C1-25N0-9O0-9S0-3P0-3HαFβIδ such that the sum of α, β and δ is
sufficient to satisfy the otherwise unsatisfied valencies of the C, N, O, S and P atoms; G is (CH2)1-6 wherein one and
only one hydrogen on the CH2 groups represented by each G is replaced with -(CH2)c-Amide-T4; Amide is

-N(R1)-C(=O)- or -C(=O)-N(R1)-;

R1 is hydrogen or C1-10 alkyl; c is an integer ranging from 0 to 4; "C2-C10" represents a hydrocarbylene group having
from 2 to 10 carbon atoms; "ODN-3'-OH" represents a nucleic acid fragment having a terminal 3' hydroxyl group (i.e.,
a nucleic acid fragment joined to (C1-C10) at other than the 3' end of the nucleic acid fragment); and n is an integer
ranging from 1 to 50 such that when n is greater than 1, then G, c, Amide, R1 and T4 are independently selected. Preferably
there are not three heteroatoms bonded to a single carbon atom, wherein T2 and T4 are organic moieties of the formula
C1-25N0-9O0-9HαFβ such that the sum of α and β is sufficient to satisfy the otherwise unsatisfied valencies of the C, N
and O atoms; G is (CH2)1-6 wherein one and only one hydrogen on the CH2 groups represented by each G is replaced
with -(CH2)c-Amide-T4; Amide is

-N(R1)-C(=O)- or -C(=O)-N(R1)-;

R1 is hydrogen or C1-10 alkyl; c is an integer ranging from 0 to 4; "ODN-3'-OH" represents a nucleic acid fragment
having a terminal 3' hydroxyl group; and n is an integer ranging from 1 to 50 such that when n is greater than 1, G, c, Amide,
R1 and T4 are independently selected.

[0115] In structures as set forth above that contain a T2-C(=O)-N(R1)- group, this group may be formed by reacting
an amine of the formula HN(R1)- with an organic acid selected from the following, which are exemplary only and do not
constitute an exhaustive list of potential organic acids: Formic acid, Acetic acid, Propiolic acid, Propionic acid,
Fluoroacetic acid, 2-Butynoic acid, Cyclopropanecarboxylic acid, Butyric acid, Methoxyacetic acid, Difluoroacetic acid,
4-Pentynoic acid, Cyclobutanecarboxylic acid, 3,3-Dimethylacrylic acid, Valeric acid, N,N-Dimethylglycine, N-Formyl-Gly-OH,
Ethoxyacetic acid, (Methylthio)acetic acid, Pyrrole-2-carboxylic acid, 3-Furoic acid, Isoxazole-5-carboxylic acid,
trans-3-Hexenoic acid, Trifluoroacetic acid, Hexanoic acid, Ac-Gly-OH, 2-Hydroxy-2-methylbutyric acid, Benzoic acid,
Nicotinic acid, 2-Pyrazinecarboxylic acid, 1-Methyl-2-pyrrolecarboxylic acid, 2-Cyclopentene-1-acetic acid,
Cyclopentylacetic acid, (S)-(-)-2-Pyrrolidone-5-carboxylic acid, N-Methyl-L-proline, Heptanoic acid, Ac-b-Ala-OH,
2-Ethyl-2-hydroxybutyric acid, 2-(2-Methoxyethoxy)acetic acid, p-Toluic acid, 6-Methylnicotinic acid,
5-Methyl-2-pyrazinecarboxylic acid, 2,5-Dimethylpyrrole-3-carboxylic acid, 4-Fluorobenzoic acid,
3,5-Dimethylisoxazole-4-carboxylic acid, 3-Cyclopentylpropionic acid, Octanoic acid, N,N-Dimethylsuccinamic acid,
Phenylpropiolic acid, Cinnamic acid, 4-Ethylbenzoic acid, p-Anisic acid, 1,2,5-Trimethylpyrrole-3-carboxylic acid,
3-Fluoro-4-methylbenzoic acid, Ac-DL-Propargylglycine, 3-(Trifluoromethyl)butyric acid, 1-Piperidinepropionic acid,
N-Acetylproline, 3,5-Difluorobenzoic acid, Ac-L-Val-OH, Indole-2-carboxylic acid, 2-Benzofurancarboxylic acid,
Benzotriazole-5-carboxylic acid, 4-n-Propylbenzoic acid, 3-Dimethylaminobenzoic acid, 4-Ethoxybenzoic acid,
4-(Methylthio)benzoic acid, N-(2-Furoyl)glycine, 2-(Methylthio)nicotinic acid, 3-Fluoro-4-methoxybenzoic acid,
Tfa-Gly-OH, 2-Naphthoic acid, Quinaldic acid, Ac-L-Ile-OH, 3-Methylindene-2-carboxylic acid,
2-Quinoxalinecarboxylic acid, 1-Methylindole-2-carboxylic acid, 2,3,6-Trifluorobenzoic acid, N-Formyl-L-Met-OH,
2-[2-(2-Methoxyethoxy)ethoxy]acetic acid, 4-n-Butylbenzoic acid, N-Benzoylglycine, 5-Fluoroindole-2-carboxylic acid,
4-n-Propoxybenzoic acid, 4-Acetyl-3,5-dimethyl-2-pyrrolecarboxylic acid, 3,5-Dimethoxybenzoic acid,
2,6-Dimethoxynicotinic acid, Cyclohexanepentanoic acid, 2-Naphthylacetic acid, 4-(1H-Pyrrol-1-yl)benzoic acid,
Indole-3-propionic acid, m-Trifluoromethylbenzoic acid, 5-Methoxyindole-2-carboxylic acid, 4-Pentylbenzoic acid,
Bz-b-Ala-OH, 4-Diethylaminobenzoic acid, 4-n-Butoxybenzoic acid, 3-Methyl-5-CF3-isoxazole-4-carboxylic acid,
(3,4-Dimethoxyphenyl)acetic acid, 4-Biphenylcarboxylic acid, Pivaloyl-Pro-OH, Octanoyl-Gly-OH, (2-Naphthoxy)acetic acid,
Indole-3-butyric acid, 4-(Trifluoromethyl)phenylacetic acid, 5-Methoxyindole-3-acetic acid, 4-(Trifluoromethoxy)benzoic
acid, Ac-L-Phe-OH, 4-Pentyloxybenzoic acid, Z-Gly-OH, 4-Carboxy-N-(fur-2-ylmethyl)pyrrolidin-2-one,
3,4-Diethoxybenzoic acid, 2,4-Dimethyl-5-CO2Et-pyrrole-3-carboxylic acid, N-(2-Fluorophenyl)succinamic acid,
3,4,5-Trimethoxybenzoic acid, N-Phenylanthranilic acid, 3-Phenoxybenzoic acid, Nonanoyl-Gly-OH,
2-Phenoxypyridine-3-carboxylic acid, 2,5-Dimethyl-1-phenylpyrrole-3-carboxylic acid, trans-4-(Trifluoromethyl)cinnamic
acid, (5-Methyl-2-phenyloxazol-4-yl)acetic acid, 4-(2-Cyclohexenyloxy)benzoic acid, 5-Methoxy-2-methylindole-3-acetic
acid, trans-4-Cotininecarboxylic acid, Bz-5-Aminovaleric acid, 4-Hexyloxybenzoic acid, N-(3-Methoxyphenyl)succinamic
acid, Z-Sar-OH, 4-(3,4-Dimethoxyphenyl)butyric acid, Ac-o-Fluoro-DL-Phe-OH, N-(4-Fluorophenyl)glutaramic acid,
4'-Ethyl-4-biphenylcarboxylic acid, 1,2,3,4-Tetrahydroacridinecarboxylic acid, 3-Phenoxyphenylacetic acid,
N-(2,4-Difluorophenyl)succinamic acid, N-Decanoyl-Gly-OH, (+)-6-Methoxy-α-methyl-2-naphthaleneacetic acid,
3-(Trifluoromethoxy)cinnamic acid, N-Formyl-DL-Trp-OH, (R)-(+)-α-Methoxy-α-(trifluoromethyl)phenylacetic acid,
Bz-DL-Leu-OH, 4-(Trifluoromethoxy)phenoxyacetic acid, 4-Heptyloxybenzoic acid, 2,3,4-Trimethoxycinnamic acid,
2,6-Dimethoxybenzoyl-Gly-OH, 3-(3,4,5-Trimethoxyphenyl)propionic acid, 2,3,4,5,6-Pentafluorophenoxyacetic acid,
N-(2,4-Difluorophenyl)glutaramic acid, N-Undecanoyl-Gly-OH, 2-(4-Fluorobenzoyl)benzoic acid,
5-Trifluoromethoxyindole-2-carboxylic acid, N-(2,4-Difluorophenyl)diglycolamic acid, Ac-L-Trp-OH,
Tfa-L-Phenylglycine-OH, 3-Iodobenzoic acid, 3-(4-n-Pentylbenzoyl)propionic acid, 2-Phenyl-4-quinolinecarboxylic acid,
4-Octyloxybenzoic acid, Bz-L-Met-OH, 3,4,5-Triethoxybenzoic acid, N-Lauroyl-Gly-OH, 3,5-Bis(trifluoromethyl)benzoic
acid, Ac-5-Methyl-DL-Trp-OH, 2-Iodophenylacetic acid, 3-Iodo-4-methylbenzoic acid, 3-(4-n-Hexylbenzoyl)propionic acid,
N-Hexanoyl-L-Phe-OH, 4-Nonyloxybenzoic acid, 4'-(Trifluoromethyl)-2-biphenylcarboxylic acid, Bz-L-Phe-OH,
N-Tridecanoyl-Gly-OH, 3,5-Bis(trifluoromethyl)phenylacetic acid, 3-(4-n-Heptylbenzoyl)propionic acid,
N-Heptanoyl-L-Phe-OH, 4-Decyloxybenzoic acid, N-(α,α,α-trifluoro-m-tolyl)anthranilic acid, Niflumic acid,
4-(2-Hydroxyhexafluoroisopropyl)benzoic acid, N-Myristoyl-Gly-OH, 3-(4-n-Octylbenzoyl)propionic acid,
N-Octanoyl-L-Phe-OH, 4-Undecyloxybenzoic acid, 3-(3,4,5-Trimethoxyphenyl)propionyl-Gly-OH, 8-Iodonaphthoic acid,
N-Pentadecanoyl-Gly-OH, 4-Dodecyloxybenzoic acid, N-Palmitoyl-Gly-OH, and N-Stearoyl-Gly-OH. These organic
acids are available from one or more of Advanced ChemTech, Louisville, KY; Bachem Bioscience Inc., Torrance, CA;
Calbiochem-Novabiochem Corp., San Diego, CA; Farchan Laboratories Inc., Gainesville, FL; Lancaster Synthesis,
Windham, NH; and MayBridge Chemical Company (c/o Ryan Scientific), Columbia, SC. The catalogs from these
companies use the abbreviations which are used above to identify the acids.

f. Combinatorial Chemistry as a Means for Preparing Tags 

[0116] Combinatorial chemistry is a type of synthetic strategy which leads to the production of large chemical
libraries (see, for example, PCT Application Publication No. WO 94/08051). These combinatorial libraries can be used as
tags for the identification of molecules of interest (MOIs). Combinatorial chemistry may be defined as the systematic
and repetitive, covalent connection of a set of different "building blocks" of varying structures to each other to yield a
large array of diverse molecular entities. Building blocks can take many forms, both naturally occurring and synthetic,
such as nucleophiles, electrophiles, dienes, alkylating or acylating agents, diamines, nucleotides, amino acids, sugars,
lipids, organic monomers, synthons, and combinations of the above. Chemical reactions used to connect the building
blocks may involve alkylation, acylation, oxidation, reduction, hydrolysis, substitution, elimination, addition, cyclization,
condensation, and the like. This process can produce libraries of compounds which are oligomeric, non-oligomeric, or
combinations thereof. If oligomeric, the compounds can be branched, unbranched, or cyclic. Examples of oligomeric
structures which can be prepared by combinatorial methods include oligopeptides, oligonucleotides, oligosaccharides,
polylipids, polyesters, polyamides, polyurethanes, polyureas, polyethers, poly(phosphorus derivatives), e.g.,
phosphates, phosphonates, phosphoramides, phosphonamides, phosphites, phosphinamides, etc., and poly(sulfur
derivatives), e.g., sulfones, sulfonates, sulfites, sulfonamides, sulfenamides, etc.
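
The "systematic and repetitive" connection of building blocks can be illustrated with a short enumeration. This is only a sketch under stated assumptions: the three-position layout and the amino acid choices are hypothetical examples, not taken from the application, and the function name is illustrative.

```python
from itertools import product


def enumerate_library(positions):
    """Return every oligomer formed by choosing one building block per position."""
    return ["-".join(combo) for combo in product(*positions)]


# Hypothetical example: three coupling positions with 3, 2 and 2 choices.
positions = [
    ["Gly", "Ala", "Val"],
    ["Ser", "Thr"],
    ["Phe", "Tyr"],
]
library = enumerate_library(positions)
```

The library size is the product of the number of choices at each position (here 3 x 2 x 2 = 12), which is why a modest number of building blocks and coupling steps yields a large array of distinct compounds.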

[0117] One common type of oligomeric combinatorial library is the peptide combinatorial library. Recent innovations
in peptide chemistry and molecular biology have enabled libraries consisting of tens to hundreds of millions of different
peptide sequences to be prepared and used. Such libraries can be divided into three broad categories. One category
of libraries involves the chemical synthesis of soluble non-support-bound peptide libraries (e.g., Houghten et al., Nature
354:84, 1991). A second category involves the chemical synthesis of support-bound peptide libraries, presented on
solid supports such as plastic pins, resin beads, or cotton (Geysen et al., Mol. Immunol. 23:709, 1986; Lam et al.,
Nature 354:82, 1991; Eichler and Houghten, Biochemistry 32:11035, 1993). In these first two categories, the building
blocks are typically L-amino acids, D-amino acids, unnatural amino acids, or some mixture or combination thereof. A
third category uses molecular biology approaches to prepare peptides or proteins on the surface of filamentous phage
particles or plasmids (Scott and Craig, Curr. Opinion Biotech. 5:40, 1994). Soluble, nonsupport-bound peptide libraries
appear to be suitable for a number of applications, including use as tags. The available repertoire of chemical diversities
in peptide libraries can be expanded by steps such as permethylation (Ostresh et al., Proc. Natl. Acad. Sci., USA
91:11138, 1994).

[0118] Numerous variants of peptide combinatorial libraries are possible in which the peptide backbone is modified,
and/or the amide bonds have been replaced by mimetic groups. Amide mimetic groups which may be used include
ureas, urethanes, and carbonylmethylene groups. Restructuring the backbone such that sidechains emanate from the
amide nitrogens of each amino acid, rather than the alpha-carbons, gives libraries of compounds known as peptoids
(Simon et al., Proc. Natl. Acad. Sci., USA 89:9367, 1992).

[0119] Another common type of oligomeric combinatorial library is the oligonucleotide combinatorial library, where
the building blocks are some form of naturally occurring or unnatural nucleotide or polysaccharide derivatives, including
where various organic and inorganic groups may substitute for the phosphate linkage, and nitrogen or sulfur may
substitute for oxygen in an ether linkage (Schneider et al., Biochem. 34:9599, 1995; Freier et al., J. Med. Chem. 38:344,
1995; Frank, J. Biotechnology 41:259, 1995; Schneider et al., Published PCT WO 942052; Ecker et al., Nucleic Acids
Res. 21:1853, 1993).

[0120] More recently, the combinatorial production of collections of non-oligomeric, small molecule compounds has
been described (DeWitt et al., Proc. Natl. Acad. Sci., USA 90:690, 1993; Bunin et al., Proc. Natl. Acad. Sci., USA
91:4708, 1994). Structures suitable for elaboration into small-molecule libraries encompass a wide variety of organic
molecules, for example heterocyclics, aromatics, alicyclics, aliphatics, steroids, antibiotics, enzyme inhibitors, ligands,
hormones, drugs, alkaloids, opioids, terpenes, porphyrins, toxins, catalysts, as well as combinations thereof.

g. Specific Methods for Combinatorial Synthesis of Tags

[0121] Two methods for the preparation and use of a diverse set of amine-containing MS tags are outlined below.
In both methods, solid phase synthesis is employed to enable simultaneous parallel synthesis of a large number of
tagged linkers, using the techniques of combinatorial chemistry. In the first method, the eventual cleavage of the tag
from the oligonucleotide results in liberation of a carboxyl amide. In the second method, cleavage of the tag produces
a carboxylic acid. The chemical components and linking elements used in these methods are abbreviated as follows:

R                  = resin
FMOC               = fluorenylmethoxycarbonyl protecting group
All                = allyl protecting group
CO2H               = carboxylic acid group
CONH2              = carboxylic amide group
NH2                = amino group
OH                 = hydroxyl group
CONH               = amide linkage
COO                = ester linkage
NH2-Rink-CO2H      = 4-[(α-amino)-2,4-dimethoxybenzyl]-phenoxybutyric acid (Rink linker)
OH-1MeO-CO2H       = (4-hydroxymethyl)phenoxybutyric acid
OH-2MeO-CO2H       = (4-hydroxymethyl-3-methoxy)phenoxyacetic acid
NH2-A-COOH         = amino acid with aliphatic or aromatic amine functionality in side chain
X1...Xn-COOH       = set of n diverse carboxylic acids with unique molecular weights
oligo1...oligo(n)  = set of n oligonucleotides
HBTU               = O-benzotriazol-1-yl-N,N,N',N'-tetramethyluronium hexafluorophosphate



The sequence of steps in Method 1 is as follows:

OH-2MeO-CONH-R
  ↓ FMOC-NH-Rink-CO2H; couple (e.g., HBTU)
FMOC-NH-Rink-COO-2MeO-CONH-R
  ↓ piperidine (remove FMOC)
NH2-Rink-COO-2MeO-CONH-R
  ↓ FMOC-NH-A-COOH; couple (e.g., HBTU)
FMOC-NH-A-CONH-Rink-COO-2MeO-CONH-R
  ↓ piperidine (remove FMOC)
NH2-A-CONH-Rink-COO-2MeO-CONH-R
  ↓ divide into n aliquots
  ↓↓↓↓↓ couple to n different acids X1...Xn-COOH
X1...Xn-CONH-A-CONH-Rink-COO-2MeO-CONH-R
  ↓↓↓↓↓ cleave tagged linkers from resin with 1% TFA
X1...Xn-CONH-A-CONH-Rink-CO2H
  ↓↓↓↓↓ couple to n oligos (oligo1...oligo(n)) (e.g., via Pfp esters)
X1...Xn-CONH-A-CONH-Rink-CONH-oligo1...oligo(n)
  ↓ pool tagged oligos
  ↓ perform sequencing reaction
  ↓ separate different length fragments from sequencing reaction (e.g., via HPLC or CE)
  ↓ cleave tags from linkers with 25%-100% TFA
X1...Xn-CONH-A-CONH
  ↓
analyze by mass spectrometry

The sequence of steps in Method 2 is as follows:

OH-1MeO-CO2-All
  ↓ FMOC-NH-A-CO2H; couple (e.g., HBTU)
FMOC-NH-A-COO-1MeO-CO2-All
  ↓ Palladium (remove Allyl)
FMOC-NH-A-COO-1MeO-CO2H
  ↓ OH-2MeO-CONH-R; couple (e.g., HBTU)
FMOC-NH-A-COO-1MeO-COO-2MeO-CONH-R
  ↓ piperidine (remove FMOC)
NH2-A-COO-1MeO-COO-2MeO-CONH-R
  ↓ divide into n aliquots
  ↓↓↓↓↓ couple to n different acids X1...Xn-CO2H
X1...Xn-CONH-A-COO-1MeO-COO-2MeO-CONH-R
  ↓↓↓↓↓ cleave tagged linkers from resin with 1% TFA
X1...Xn-CONH-A-COO-1MeO-CO2H
  ↓↓↓↓↓ couple to n oligos (oligo1...oligo(n)) (e.g., via Pfp esters)
X1...Xn-CONH-A-COO-1MeO-CONH-oligo1...oligo(n)
  ↓ pool tagged oligos
  ↓ perform sequencing reaction
  ↓ separate different length fragments from sequencing reaction (e.g., via HPLC or CE)
  ↓ cleave tags from linkers with 25-100% TFA
X1...Xn-CONH-A-CO2H
  ↓
analyze by mass spectrometry
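
Both methods rely on the n acids X1...Xn having unique molecular weights so that the released tags can be told apart by mass spectrometry. The following is a rough sketch of how a candidate set might be screened; the function names and the 1 Da minimum spacing are assumptions for illustration, and the masses computed are those of the free acids rather than of the released tag fragments.

```python
import re
from itertools import combinations

# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915, "F": 18.998403}


def mono_mass(formula):
    """Monoisotopic mass of a condensed formula such as 'C2H4O2'."""
    return sum(MONO[symbol] * (int(number) if number else 1)
               for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula))


def find_clashes(acids, min_gap=1.0):
    """Return name pairs whose masses differ by less than min_gap (Da)."""
    masses = {name: mono_mass(formula) for name, formula in acids.items()}
    return [(a, b) for a, b in combinations(sorted(masses), 2)
            if abs(masses[a] - masses[b]) < min_gap]


acids = {
    "formic acid": "CH2O2",
    "acetic acid": "C2H4O2",
    "propionic acid": "C3H6O2",
    "butyric acid": "C4H8O2",
}
```

With these four acids `find_clashes(acids)` returns an empty list. Adding an isomer such as isobutyric acid (a hypothetical addition, also C4H8O2) would be reported as clashing with butyric acid, since isomers cannot be separated by mass alone.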



2. Linkers 

[0122] A "linker" component (or L), as used herein, means either a direct covalent bond or an organic chemical
group which is used to connect a "tag" (or T) to a "molecule of interest" (or MOI) through covalent chemical bonds. In
addition, the direct bond itself, or one or more bonds within the linker component, is cleavable under conditions which
allow T to be released (in other words, cleaved) from the remainder of the T-L-X compound (including the MOI
component). The tag variable component which is present within T should be stable to the cleavage conditions. Preferably, the
cleavage can be accomplished rapidly, within a few minutes and preferably within about 15 seconds or less.
[0123] In general, a linker is used to connect each of a large set of tags to each of a similarly large set of MOIs.
Typically, a single tag-linker combination is attached to each MOI (to give various T-L-MOI), but in some cases, more
than one tag-linker combination may be attached to each individual MOI (to give various (T-L)n-MOI). In another
embodiment of the present invention, two or more tags are bonded to a single linker through multiple, independent sites on the
linker, and this multiple tag-linker combination is then bonded to an individual MOI (to give various (T)n-L-MOI).
[0124] After various manipulations of the set of tagged MOIs, special chemical and/or physical conditions are used
to cleave one or more covalent bonds in the linker, resulting in the liberation of the tags from the MOIs. The cleavable
bond(s) may or may not be some of the same bonds that were formed when the tag, linker, and MOI were connected
together. The design of the linker will, in large part, determine the conditions under which cleavage may be
accomplished. Accordingly, linkers may be identified by the cleavage conditions they are particularly susceptible to. When a
linker is photolabile (i.e., prone to cleavage by exposure to actinic radiation), the linker may be given the designation
Lhν. Likewise, the designations Lacid, Lbase, L[O], L[R], Lenz, Lelc, LΔ and Lss may be used to refer to linkers that are
particularly susceptible to cleavage by acid, base, chemical oxidation, chemical reduction, the catalytic activity of an
enzyme (more simply "enzyme"), electrochemical oxidation or reduction, elevated temperature ("thermal") and thiol
exchange, respectively.

[0125] Certain types of linker are labile to a single type of cleavage condition, whereas others are labile to several
types of cleavage conditions. In addition, in linkers which are capable of bonding multiple tags (to give (T)n-L-MOI type
structures), each of the tag-bonding sites may be labile to different cleavage conditions. For example, in a linker having
two tags bonded to it, one of the tags may be labile only to base, and the other labile only to photolysis.
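
The two-tag example can be modelled as a small data structure. This is a sketch only, with hypothetical class and tag names; it records which cleavage conditions each tag-bonding site responds to and reports which tags a given condition would release.

```python
from dataclasses import dataclass
from enum import Enum


class Cleavage(Enum):
    """Cleavage conditions named in the text."""
    PHOTOLYSIS = "photolysis"
    ACID = "acid"
    BASE = "base"
    OXIDATION = "chemical oxidation"
    REDUCTION = "chemical reduction"
    ENZYME = "enzyme"
    ELECTROCHEMICAL = "electrochemical oxidation or reduction"
    THERMAL = "thermal"
    THIOL_EXCHANGE = "thiol exchange"


@dataclass(frozen=True)
class TagSite:
    """One tag-bonding site on a linker and the conditions that cleave it."""
    tag: str
    labile_to: frozenset


def released_tags(sites, condition):
    """Names of the tags released when the linker is exposed to `condition`."""
    return [site.tag for site in sites if condition in site.labile_to]


# A (T)2-L-MOI linker: one site labile only to base, the other only to light.
two_tag_linker = [
    TagSite("tag_A", frozenset({Cleavage.BASE})),
    TagSite("tag_B", frozenset({Cleavage.PHOTOLYSIS})),
]
```

Calling `released_tags(two_tag_linker, Cleavage.BASE)` yields only `tag_A`, so the two tags could in principle be liberated and read in separate steps.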
[0126] A linker which is useful in the present invention possesses several attributes:

1) The linker possesses a chemical handle (Lh) through which it can be attached to an MOI.

2) The linker possesses a second, separate chemical handle (Lh) through which the tag is attached to the linker. If
multiple tags are attached to a single linker ((T)n-L-MOI type structures), then a separate handle exists for each tag.

3) The linker is stable toward all manipulations to which it is subjected, with the exception of the conditions which
allow cleavage such that a T-containing moiety is released from the remainder of the compound, including the MOI.
Thus, the linker is stable during attachment of the tag to the linker, attachment of the linker to the MOI, and any
manipulations of the MOI while the tag and linker (T-L) are attached to it.

4) The linker does not significantly interfere with the manipulations performed on the MOI while the T-L is attached
to it. For instance, if the T-L is attached to an oligonucleotide, the T-L must not significantly interfere with any
hybridization or enzymatic reactions (e.g., PCR) performed on the oligonucleotide. Similarly, if the T-L is attached to an
antibody, it must not significantly interfere with antigen recognition by the antibody.

5) Cleavage of the tag from the remainder of the compound occurs in a highly controlled manner, using physical or
chemical processes that do not adversely affect the detectability of the tag.

[0127] For any given linker, it is preferred that the linker be attachable to a wide variety of MOIs, and that a wide
variety of tags be attachable to the linker. Such flexibility is advantageous because it allows a library of T-L conjugates,
once prepared, to be used with several different sets of MOIs.
[0128] As explained above, a preferred linker has the formula

Lh-L1-L2-L3-Lh

wherein each Lh is a reactive handle that can be used to link the linker to a tag reactant and a molecule of interest
reactant. L2 is an essential part of the linker, because L2 imparts lability to the linker. L1 and L3 are optional groups which
effectively serve to separate L2 from the handles Lh.
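
The five-part linker formula, with its required labile group and optional spacer groups, maps naturally onto a record type. The following is a minimal sketch; the class name, field names and the example group strings are hypothetical, and an absent spacer is treated as a direct bond.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Linker:
    """Handle, optional L1, required labile L2, optional L3, handle."""
    handle_tag: str            # handle used to attach the tag
    l2: str                    # labile group; always required
    handle_moi: str            # handle used to attach the MOI
    l1: Optional[str] = None   # None means a direct bond (group absent)
    l3: Optional[str] = None   # None means a direct bond (group absent)

    def render(self):
        """Write the linker left to right, skipping absent spacer groups."""
        parts = [self.handle_tag, self.l1, self.l2, self.l3, self.handle_moi]
        return "-".join(part for part in parts if part is not None)


bare = Linker("Lh", "o-nitrobenzyl", "Lh")
spaced = Linker("Lh", "o-nitrobenzyl", "Lh", l1="CH2", l3="O-CH2")
```

Making `l2` a required field while `l1` and `l3` default to `None` mirrors the statement that only the labile group is essential.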

[0129] L1 (which, by definition, is nearer to T than is L3), serves to separate T from the required labile moiety L2.
This separation may be useful when the cleavage reaction generates particularly reactive species (e.g., free radicals)
which may cause random changes in the structure of the T-containing moiety. As the cleavage site is further separated
from the T-containing moiety, there is a reduced likelihood that reactive species formed at the cleavage site will disrupt
the structure of the T-containing moiety. Also, as the atoms in L1 will typically be present in the T-containing moiety,
these L1 atoms may impart a desirable quality to the T-containing moiety. For example, where the T-containing moiety
is a Tms-containing moiety, and a hindered amine is desirably present as part of the structure of the Tms-containing
moiety (to serve, e.g., as a MSSE), the hindered amine may be present in L1 labile moiety.

[0130] In other instances, L1 and/or L3 may be present in a linker component merely because the commercial
supplier of a linker chooses to sell the linker in a form having such a L1 and/or L3 group. In such an instance there is no
harm in using linkers having L1 and/or L3 groups (so long as these groups do not inhibit the cleavage reaction), even
though they may not contribute any particular performance advantage to the compounds that incorporate them. Thus,
the present invention allows for L1 and/or L3 groups to be present in the linker component.

[0131] L1 and/or L3 groups may be a direct bond (in which case the group is effectively not present), a
hydrocarbylene group (e.g., alkylene, arylene, cycloalkylene, etc.), -O-hydrocarbylene (e.g., -O-CH2-, O-CH2CH(CH3)-, etc.) or
hydrocarbylene-(O-hydrocarbylene)w- wherein w is an integer ranging from 1 to about 10 (e.g., -CH2-O-Ar-,
-CH2-(O-CH2CH2)4-, etc.).




[0132] With the advent of solid phase synthesis, a great body of literature has developed regarding linkers that are
labile to specific reaction conditions. In typical solid phase synthesis, a solid support is bonded through a labile linker to
a reactive site, and a molecule to be synthesized is generated at the reactive site. When the molecule has been
completely synthesized, the solid support-linker-molecule construct is subjected to cleavage conditions which release the
molecule from the solid support. The labile linkers which have been developed for use in this context (or which may be
used in this context) may also be readily used as the linker reactant in the present invention.

[0133] Lloyd-Williams, P., et al., "Convergent Solid-Phase Peptide Synthesis", Tetrahedron Report No. 347,
49(48):11065-11133 (1993) provides an extensive discussion of linkers which are labile to actinic radiation (i.e.,
photolysis), as well as acid, base and other cleavage conditions. Additional sources of information about labile linkers are
well known in the art.

[0134] As described above, different linker designs will confer cleavability ("lability") under different specific physical
or chemical conditions. Examples of conditions which serve to cleave various designs of linker include acid, base,
oxidation, reduction, fluoride, thiol exchange, photolysis, and enzymatic conditions.

[0135] Examples of cleavable linkers that satisfy the general criteria for linkers listed above will be well known to
those in the art and include those found in the catalog available from Pierce (Rockford, IL). Examples include:

• ethylene glycol bis(succinimidylsuccinate) (EGS), an amine reactive cross-linking reagent which is cleavable by
hydroxylamine (1 M at 37°C for 3-6 hours);

• disuccinimidyl tartarate (DST) and sulfo-DST, which are amine reactive cross-linking reagents, cleavable by 0.015
M sodium periodate;

• bis[2-(succinimidyloxycarbonyloxy)ethyl]sulfone (BSOCOES) and sulfo-BSOCOES, which are amine reactive
cross-linking reagents, cleavable by base (pH 11.6);

• 1,4-di-[3'-(2'-pyridyldithio)propionamido]butane (DPDPB), a pyridyldithiol crosslinker which is cleavable by thiol
exchange or reduction;

• N-[4-(p-azidosalicylamido)-butyl]-3'-(2'-pyridyldithio)propionamide (APDP), a pyridyldithiol crosslinker which is
cleavable by thiol exchange or reduction;

• bis-[beta-4-(azidosalicylamido)ethyl]-disulfide, a photoreactive crosslinker which is cleavable by thiol exchange or
reduction;

• N-succinimidyl-(4-azidophenyl)-1,3'-dithiopropionate (SADP), a photoreactive crosslinker which is cleavable by thiol
exchange or reduction;

• sulfosuccinimidyl-2-(7-azido-4-methylcoumarin-3-acetamide)ethyl-1,3'-dithiopropionate (SAED), a photoreactive
crosslinker which is cleavable by thiol exchange or reduction;

• sulfosuccinimidyl-2-(m-azido-o-nitrobenzamido)-ethyl-1,3'-dithiopropionate (SAND), a photoreactive crosslinker
which is cleavable by thiol exchange or reduction.

[0136] Other examples of cleavable linkers and the cleavage conditions that can be used to release tags are as
follows. A silyl linking group can be cleaved by fluoride or under acidic conditions. A 3-, 4-, 5-, or
6-substituted-2-nitrobenzyloxy or 2-, 3-, 5-, or 6-substituted-4-nitrobenzyloxy linking group can be cleaved by a photon source (photolysis). A
3-, 4-, 5-, or 6-substituted-2-alkoxyphenoxy or 2-, 3-, 5-, or 6-substituted-4-alkoxyphenoxy linking group can be cleaved
by Ce(NH4)2(NO3)6 (oxidation). A NCO2 (urethane) linker can be cleaved by hydroxide (base), acid, or LiAlH4
(reduction). A 3-pentenyl, 2-butenyl, or 1-butenyl linking group can be cleaved by O3, OsO4/IO4-, or KMnO4 (oxidation). A
2-[3-, 4-, or 5-substituted-furyl]oxy linking group can be cleaved by O2, Br2, MeOH, or acid.

[0137] Conditions for the cleavage of other labile linking groups include: t-alkyloxy linking groups can be cleaved by
acid; methyl(dialkyl)methoxy or 4-substituted-2-alkyl-1,3-dioxolane-2-yl linking groups can be cleaved by H3O+;
2-silylethoxy linking groups can be cleaved by fluoride or acid; 2-(X)-ethoxy (where X = keto, ester amide, cyano, NO2,
sulfide, sulfoxide, sulfone) linking groups can be cleaved under alkaline conditions; 2-, 3-, 4-, 5-, or
6-substituted-benzyloxy linking groups can be cleaved by acid or under reductive conditions; 2-butenyloxy linking groups can be cleaved
by (Ph3P)3RhCl(H); 3-, 4-, 5-, or 6-substituted-2-bromophenoxy linking groups can be cleaved by Li, Mg, or BuLi;
methylthiomethoxy linking groups can be cleaved by Hg2+; 2-(X)-ethyloxy (where X = a halogen) linking groups can be
cleaved by Zn or Mg; 2-hydroxyethyloxy linking groups can be cleaved by oxidation (e.g., with Pb(OAc)4).
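
The pairing of linking groups with cleavage conditions lends itself to a lookup table. The sketch below encodes only a handful of the pairings stated in the text, using simplified condition labels chosen for illustration; the function name is hypothetical.

```python
# A few linking groups and the conditions stated to cleave them.
CLEAVAGE_TABLE = {
    "silyl": {"fluoride", "acid"},
    "2-silylethoxy": {"fluoride", "acid"},
    "t-alkyloxy": {"acid"},
    "urethane": {"base", "acid", "reduction"},
    "methylthiomethoxy": {"Hg2+"},
    "2-hydroxyethyloxy": {"oxidation"},
}


def groups_cleaved_by(condition, table=CLEAVAGE_TABLE):
    """Sorted list of linking groups that the given condition cleaves."""
    return sorted(group for group, conditions in table.items()
                  if condition in conditions)
```

For instance, `groups_cleaved_by("fluoride")` returns the two silicon-based groups, while `groups_cleaved_by("acid")` returns four entries, illustrating that some conditions are far less selective than others.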

[0138] Preferred linkers are those that are cleaved by acid or photolysis. Several of the acid-labile linkers that have
been developed for solid phase peptide synthesis are useful for linking tags to MOIs. Some of these linkers are
described in a recent review by Lloyd-Williams et al. (Tetrahedron 49:11065-11133, 1993). One useful type of linker is
based upon p-alkoxybenzyl alcohols, of which two, 4-hydroxymethylphenoxyacetic acid and
4-(4-hydroxymethyl-3-methoxyphenoxy)butyric acid, are commercially available from Advanced ChemTech (Louisville, KY). Both linkers can
be attached to a tag via an ester linkage to the benzylalcohol, and to an amine-containing MOI via an amide linkage to
the carboxylic acid. Tags linked by these molecules are released from the MOI with varying concentrations of
trifluoroacetic acid. The cleavage of these linkers results in the liberation of a carboxylic acid on the tag. Acid cleavage of tags
attached through related linkers, such as 2,4-dimethoxy-4'-(carboxymethyloxy)-benzhydrylamine (available from
Advanced ChemTech in FMOC-protected form), results in liberation of a carboxylic amide on the released tag.
[0139] The photolabile linkers useful for this application have also been for the most part developed for solid phase
peptide synthesis (see Lloyd-Williams review). These linkers are usually based on 2-nitrobenzylesters or
2-nitrobenzylamides. Two examples of photolabile linkers that have recently been reported in the literature are
4-(4-(1-Fmoc-amino)ethyl)-2-methoxy-5-nitrophenoxy)butanoic acid (Holmes and Jones, J. Org. Chem. 60:2318-2319, 1995) and
3-(Fmoc-amino)-3-(2-nitrophenyl)propionic acid (Brown et al., Molecular Diversity 1:4-12, 1995). Both linkers can be
attached via the carboxylic acid to an amine on the MOI. The attachment of the tag to the linker is made by forming an
amide between a carboxylic acid on the tag and the amine on the linker. Cleavage of photolabile linkers is usually
performed with UV light of 350 nm wavelength at intensities and times known to those in the art. Cleavage of the linkers
results in liberation of a primary amide on the tag. Examples of photocleavable linkers include nitrophenyl glycine
esters, exo- and endo-2-benzonorborneyl chlorides and methane sulfonates, and 3-amino-3(2-nitrophenyl) propionic
acid. Examples of enzymatic cleavage include esterases which will cleave ester bonds, nucleases which will cleave
phosphodiester bonds, proteases which cleave peptide bonds, etc.

[0140] A preferred linker component has an ortho-nitrobenzyl structure as shown below: 




wherein one carbon atom at positions a, b, c, d or e is substituted with -L3-X, and L1 (which is preferably a direct bond)
is present to the left of N(R1) in the above structure. Such a linker component is susceptible to selective photo-induced
cleavage of the bond between the carbon labeled "a" and N(R1). The identity of R1 is not typically critical to the cleavage
reaction; however, R1 is preferably selected from hydrogen and hydrocarbyl. The present invention provides that in the
above structure, -N(R1)- could be replaced with -O-. Also in the above structure, one or more of positions b, c, d or e
may optionally be substituted with alkyl, alkoxy, fluoride, chloride, hydroxyl, carboxylate or amide, where these
substituents are independently selected at each occurrence.

[0141] A further preferred linker component with a chemical handle Lh has the following structure:




wherein one or more of positions b, c, d or e is substituted with hydrogen, alkyl, alkoxy, fluoride, chloride, hydroxyl,
carboxylate or amide, R1 is hydrogen or hydrocarbyl, and R2 is -OH or a group that either protects or activates a carboxylic
acid for coupling with another moiety. Fluorocarbon and hydrofluorocarbon groups are preferred groups that activate a
carboxylic acid toward coupling with another moiety.

3. Molecule of Interest (MOI)

[0142] Examples of MOIs include nucleic acids or nucleic acid analogues (e.g., PNA), fragments of nucleic acids
(i.e., nucleic acid fragments), synthetic nucleic acids or fragments, oligonucleotides (e.g., DNA or RNA), proteins,
peptides, antibodies or antibody fragments, receptors, receptor ligands, members of a ligand pair, cytokines, hormones,
oligosaccharides, synthetic organic molecules, drugs, and combinations thereof.

[0143] Preferred MOIs include nucleic acid fragments. Preferred nucleic acid fragments are primer sequences that
are complementary to sequences present in vectors, where the vectors are used for base sequencing. Preferably a
nucleic acid fragment is attached directly or indirectly to a tag at other than the 3' end of the fragment, and most
preferably at the 5' end of the fragment. Nucleic acid fragments may be purchased or prepared based upon genetic
databases (e.g., Dib et al., Nature 380:152-154, 1996 and CEPH Genotype Database, http://www.cephb.fr) and commercial
vendors (e.g., Promega, Madison, WI).

[0144] As used herein, MOI includes derivatives of an MOI that contain functionality useful in joining the MOI to a
T-L-Lh compound. For example, a nucleic acid fragment that has a phosphodiester at the 5' end, where the
phosphodiester is also bonded to an alkyleneamine, is an MOI. Such an MOI is described in, e.g., U.S. Patent 4,762,779,
which is incorporated herein by reference. A nucleic acid fragment with an internal modification is also an MOI. An
exemplary internal modification of a nucleic acid fragment is where the base (e.g., adenine, guanine, cytosine,
thymidine, uracil) has been modified to add a reactive functional group. Such internally modified nucleic acid fragments
are commercially available from, e.g., Glen Research, Herndon, VA. Another exemplary internal modification of a nucleic
acid fragment is where an abasic phosphoramidate is used to synthesize a modified phosphodiester which is interposed
between a sugar and phosphate group of a nucleic acid fragment. The abasic phosphoramidate contains a reactive group
which allows a nucleic acid fragment that contains this phosphoramidate-derived moiety to be joined to another moiety,
e.g., a T-L-Lh compound. Such abasic phosphoramidates are commercially available from, e.g., Clonetech Laboratories,
Inc., Palo Alto, CA.

4. Chemical Handles (Lh)

[0145] A chemical handle is a stable yet reactive atomic arrangement present as part of a first molecule, where the
handle can undergo chemical reaction with a complementary chemical handle present as part of a second molecule,
so as to form a covalent bond between the two molecules. For example, the chemical handle may be a hydroxyl group,
and the complementary chemical handle may be a carboxylic acid group (or an activated derivative thereof, e.g., a
hydrofluoroaryl ester), whereupon reaction between these two handles forms a covalent bond (specifically, an ester
group) that joins the two molecules together.

[0146] Chemical handles may be used in a large number of covalent bond-forming reactions that are suitable for
attaching tags to linkers, and linkers to MOIs. Such reactions include alkylation (e.g., to form ethers, thioethers),
acylation (e.g., to form esters, amides, carbamates, ureas, thioureas), phosphorylation (e.g., to form phosphates,
phosphonates, phosphoramides, phosphonamides), sulfonylation (e.g., to form sulfonates, sulfonamides), condensation
(e.g., to form imines, oximes, hydrazones), silylation, disulfide formation, and generation of reactive intermediates,
such as nitrenes or carbenes, by photolysis. In general, handles and bond-forming reactions which are suitable for
attaching tags to linkers are also suitable for attaching linkers to MOIs, and vice-versa. In some cases, the MOI may
undergo prior modification or derivatization to provide the handle needed for attaching the linker.

[0147] One type of bond especially useful for attaching linkers to MOIs is the disulfide bond. Its formation requires
the presence of a thiol group ("handle") on the linker, and another thiol group on the MOI. Mild oxidizing conditions then
suffice to bond the two thiols together as a disulfide. Disulfide formation can also be induced by using an excess of an
appropriate disulfide exchange reagent, e.g., pyridyl disulfides. Because disulfide formation is readily reversible, the
disulfide may also be used as the cleavable bond for liberating the tag, if desired. This is typically accomplished under
similarly mild conditions, using an excess of an appropriate thiol exchange reagent, e.g., dithiothreitol.
[0148] Of particular interest for linking tags (or tags with linkers) to oligonucleotides is the formation of amide bonds.
Primary aliphatic amine handles can be readily introduced onto synthetic oligonucleotides with phosphoramidites such
as 6-monomethoxytritylhexylcyanoethyl-N,N-diisopropyl phosphoramidite (available from Glen Research, Sterling,
VA). The amines found on natural nucleotides such as adenosine and guanosine are virtually unreactive when
compared to the introduced primary amine. This difference in reactivity forms the basis of the ability to selectively form
amides and related bonding groups (e.g., ureas, thioureas, sulfonamides) with the introduced primary amine, and not
the nucleotide amines.

[0149] As listed in the Molecular Probes catalog (Eugene, OR), a partial enumeration of amine-reactive functional
groups includes activated carboxylic esters, isocyanates, isothiocyanates, sulfonyl halides, and dichlorotriazenes.
Active esters are excellent reagents for amine modification since the amide products formed are very stable. Also, these
reagents have good reactivity with aliphatic amines and low reactivity with the nucleotide amines of oligonucleotides.
Examples of active esters include N-hydroxysuccinimide esters, pentafluorophenyl esters, tetrafluorophenyl esters, and
p-nitrophenyl esters. Active esters are useful because they can be made from virtually any molecule that contains a
carboxylic acid. Methods to make active esters are listed in Bodansky (Principles of Peptide Chemistry (2d ed), Springer
Verlag, London, 1993).






5. Linker Attachment 



[0150] Typically, a single type of linker is used to connect a particular set or family of tags to a particular set or family
of MOIs. In a preferred embodiment of the invention, a single, uniform procedure may be followed to create all the
various T-L-MOI structures. This is especially advantageous when the set of T-L-MOI structures is large, because it allows
the set to be prepared using the methods of combinatorial chemistry or other parallel processing technology. In a similar
manner, the use of a single type of linker allows a single, uniform procedure to be employed for cleaving all the various
T-L-MOI structures. Again, this is advantageous for a large set of T-L-MOI structures, because the set may be
processed in a parallel, repetitive, and/or automated manner.

[0151] There are, however, other embodiments of the present invention, wherein two or more types of linker are used
to connect different subsets of tags to corresponding subsets of MOIs. In this case, selective cleavage conditions may
be used to cleave each of the linkers independently, without cleaving the linkers present on other subsets of MOIs.
[0152] A large number of covalent bond-forming reactions are suitable for attaching tags to linkers, and linkers to
MOIs. Such reactions include alkylation (e.g., to form ethers, thioethers), acylation (e.g., to form esters, amides,
carbamates, ureas, thioureas), phosphorylation (e.g., to form phosphates, phosphonates, phosphoramides,
phosphonamides), sulfonylation (e.g., to form sulfonates, sulfonamides), condensation (e.g., to form imines, oximes,
hydrazones), silylation, disulfide formation, and generation of reactive intermediates, such as nitrenes or carbenes, by
photolysis. In general, handles and bond-forming reactions which are suitable for attaching tags to linkers are also
suitable for attaching linkers to MOIs, and vice-versa. In some cases, the MOI may undergo prior modification or
derivatization to provide the handle needed for attaching the linker.

[0153] One type of bond especially useful for attaching linkers to MOIs is the disulfide bond. Its formation requires
the presence of a thiol group ("handle") on the linker, and another thiol group on the MOI. Mild oxidizing conditions then
suffice to bond the two thiols together as a disulfide. Disulfide formation can also be induced by using an excess of an
appropriate disulfide exchange reagent, e.g., pyridyl disulfides. Because disulfide formation is readily reversible, the
disulfide may also be used as the cleavable bond for liberating the tag, if desired. This is typically accomplished under
similarly mild conditions, using an excess of an appropriate thiol exchange reagent, e.g., dithiothreitol.
[0154] Of particular interest for linking tags to oligonucleotides is the formation of amide bonds. Primary aliphatic
amine handles can be readily introduced onto synthetic oligonucleotides with phosphoramidites such as
6-monomethoxytritylhexylcyanoethyl-N,N-diisopropyl phosphoramidite (available from Glen Research, Sterling, VA). The
amines found on natural nucleotides such as adenosine and guanosine are virtually unreactive when compared to the
introduced primary amine. This difference in reactivity forms the basis of the ability to selectively form amides and
related bonding groups (e.g., ureas, thioureas, sulfonamides) with the introduced primary amine, and not the nucleotide
amines.

[0155] As listed in the Molecular Probes catalog (Eugene, OR), a partial enumeration of amine-reactive functional
groups includes activated carboxylic esters, isocyanates, isothiocyanates, sulfonyl halides, and dichlorotriazenes.
Active esters are excellent reagents for amine modification since the amide products formed are very stable. Also, these
reagents have good reactivity with aliphatic amines and low reactivity with the nucleotide amines of oligonucleotides.
Examples of active esters include N-hydroxysuccinimide esters, pentafluorophenyl esters, tetrafluorophenyl esters and
p-nitrophenyl esters. Active esters are useful because they can be made from virtually any molecule that contains a
carboxylic acid. Methods to make active esters are listed in Bodansky (Principles of Peptide Chemistry (2d ed), Springer
Verlag, London, 1993).
[0156] Numerous commercial cross-linking reagents exist which can serve as linkers (e.g., see Pierce Cross-linkers,
Pierce Chemical Co., Rockford, IL). Among these are homobifunctional amine-reactive cross-linking reagents,
which are exemplified by homobifunctional imidoesters and N-hydroxysuccinimidyl (NHS) esters. There also exist
heterobifunctional cross-linking reagents, which possess two or more different reactive groups that allow for sequential
reactions. Imidoesters react rapidly with amines at alkaline pH. NHS-esters give stable products when reacted with primary
or secondary amines. Maleimides, alkyl and aryl halides, alpha-haloacyls and pyridyl disulfides are thiol reactive.
Maleimides are specific for thiol (sulfhydryl) groups in the pH range of 6.5 to 7.5, and at alkaline pH can become amine
reactive. The thioether linkage is stable under physiological conditions. Alpha-haloacetyl cross-linking reagents contain
the iodoacetyl group and are reactive towards sulfhydryls. Imidazoles can react with the iodoacetyl moiety, but the reaction
is very slow. Pyridyl disulfides react with thiol groups to form a disulfide bond. Carbodiimides couple carboxyls to
primary amines of hydrazides, which gives rise to the formation of an acyl-hydrazine bond. The arylazides are photoaffinity
reagents which are chemically inert until exposed to UV or visible light. When such compounds are photolyzed at
250-460 nm, a reactive aryl nitrene is formed. The reactive aryl nitrene is relatively non-specific. Glyoxals are reactive
towards the guanidinyl portion of arginine.

[0157] In one typical embodiment of the present invention, a tag is first bonded to a linker, then the combination of
tag and linker is bonded to a MOI, to create the structure T-L-MOI. Alternatively, the same structure is formed by first 
bonding a linker to a MOI, and then bonding the combination of linker and MOI to a tag. An example is where the MOI 




is a DNA primer or oligonucleotide. In that case, the tag is typically first bonded to a linker, then the T-L is bonded to a 
DNA primer or oligonucleotide, which is then used, for example, in a sequencing reaction. 

[0158] One useful form in which a tag could be reversibly attached to an MOI (e.g., an oligonucleotide or DNA
sequencing primer) is through a chemically labile linker. One preferred design for the linker allows the linker to be
cleaved when exposed to a volatile organic acid, for example, trifluoroacetic acid (TFA). TFA in particular is compatible
with most methods of MS ionization, including electrospray.

[0159] As described in detail below, the invention provides a method for determining the sequence of a nucleic acid
molecule. A composition which may be formed by the inventive method comprises a plurality of compounds of the
formula:

Tms-L-MOI

wherein Tms is an organic group detectable by mass spectrometry. Tms contains carbon, at least one of hydrogen
and fluoride, and may contain optional atoms including oxygen, nitrogen, sulfur, phosphorus and iodine. In the formula,
L is an organic group which allows a Tms-containing moiety to be cleaved from the remainder of the compound upon
exposure of the compound to a cleavage condition. The cleaved Tms-containing moiety includes a functional group which
supports a single ionized charge state when each of the plurality of compounds is subjected to mass spectrometry. The
functional group may be a tertiary amine, quaternary amine or an organic acid. In the formula, MOI is a nucleic acid
fragment which is conjugated to L via the 5' end of the MOI. The term "conjugated" means that there may be chemical
groups intermediate L and the MOI, e.g., a phosphodiester group and/or an alkylene group. The nucleic acid fragment
may have a sequence complementary to a portion of a vector, wherein the fragment is capable of priming nucleotide
synthesis.

[0160] In the composition, no two compounds have either the same Tms or the same MOI. In other words, the
composition includes a plurality of compounds, wherein each compound has both a unique Tms and a unique nucleic acid
fragment (unique in that it has a unique base sequence). In addition, the composition may be described as having a
plurality of compounds wherein each compound is defined as having a unique Tms, where the Tms is unique in that no
other compound has a Tms that provides the same signal by mass spectrometry. The composition therefore contains a
plurality of compounds, each having a Tms with a unique mass. The composition may also be described as having a
plurality of compounds wherein each compound is defined as having a unique nucleic acid sequence. These nucleic
acid sequences are intentionally unique so that each compound will serve as a primer for only one vector, when the
composition is combined with vectors for nucleic acid sequencing. The set of compounds having unique Tms groups is
the same set of compounds which has unique nucleic acid sequences.

[0161] Preferably, the Tms groups are unique in that there is at least a 2 amu, more preferably at least a 3 amu, and
still more preferably at least a 4 amu mass separation between the Tms groups of any two different compounds. In the
composition, there are at least 2 different compounds, preferably there are more than 2 different compounds, and more
preferably there are more than 4 different compounds. The composition may contain 100 or more different compounds,
each compound having a unique Tms and a unique nucleic acid sequence.
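The mass-separation requirement above lends itself to a mechanical check. The following is a minimal Python sketch offered only as an illustration, not as part of the described method: the function names `min_mass_separation` and `tags_are_resolvable` and the example tag masses are hypothetical.

```python
from itertools import combinations


def min_mass_separation(tag_masses):
    """Return the smallest pairwise mass difference (in amu) among the tags."""
    if len(tag_masses) < 2:
        raise ValueError("need at least two tags to compare")
    return min(abs(a - b) for a, b in combinations(tag_masses, 2))


def tags_are_resolvable(tag_masses, min_separation_amu=2.0):
    """True if every pair of tags differs by at least min_separation_amu."""
    return min_mass_separation(tag_masses) >= min_separation_amu


# Hypothetical tag masses (amu), chosen for illustration only.
example_masses = [301.0, 305.0, 309.0, 313.0]
print(tags_are_resolvable(example_masses, min_separation_amu=4.0))
```

A set that passes at a 4 amu threshold also passes at the less stringent 2 amu and 3 amu thresholds, which mirrors the order of preference stated in the paragraph.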

[0162] Another composition that is useful in, e.g., determining the sequence of a nucleic acid molecule, includes
water and a compound of the formula Tms-L-MOI, wherein Tms is an organic group detectable by mass spectrometry.
Tms contains carbon, at least one of hydrogen and fluoride, and may contain optional atoms including oxygen, nitrogen,
sulfur, phosphorus and iodine. In the formula, L is an organic group which allows a Tms-containing moiety to be cleaved
from the remainder of the compound upon exposure of the compound to a cleavage condition. The cleaved
Tms-containing moiety includes a functional group which supports a single ionized charge state when each of the plurality
of compounds is subjected to mass spectrometry. The functional group may be a tertiary amine, quaternary amine or an
organic acid. In the formula, MOI is a nucleic acid fragment attached at its 5' end.

[0163] In addition to water, this composition may contain a buffer, in order to maintain the pH of the aqueous
composition within the range of about 5 to about 9. Furthermore, the composition may contain an enzyme, salts (such as
MgCl2 and NaCl) and one of dATP, dGTP, dCTP, and dTTP. A preferred composition contains water, Tms-L-MOI and
one (and only one) of ddATP, ddGTP, ddCTP, and ddTTP. Such a composition is suitable for use in the dideoxy
sequencing method.

[0164] The invention also provides a composition which contains a plurality of sets of compounds, wherein each set
of compounds has the formula:

Tms-L-MOI

wherein,

[0165] Tms is an organic group detectable by mass spectrometry, comprising carbon, at least one of hydrogen and
fluoride, and optional atoms selected from oxygen, nitrogen, sulfur, phosphorus and iodine. L is an organic group which






allows a Tms-containing moiety to be cleaved from the remainder of the compound, wherein the Tms-containing moiety
comprises a functional group which supports a single ionized charge state when the compound is subjected to mass
spectrometry and is selected from tertiary amine, quaternary amine and organic acid. The MOI is a nucleic acid
fragment wherein L is conjugated to MOI at the MOI's 5' end.

[0166] Within a set, all members have the same Tms group, and the MOI fragments have variable lengths that
terminate with the same dideoxynucleotide selected from ddAMP, ddGMP, ddCMP and ddTMP; and between sets the Tms
groups differ by at least 2 amu, preferably by at least 3 amu. The plurality of sets is preferably at least 5 and may number
100 or more.
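Because each set carries its own tag, and every member of a set ends in the same dideoxynucleotide, a detected tag identifies both the set and the terminal base. The Python sketch below illustrates how pooled observations from several sets might be read back into sequences. It is an idealized illustration under stated assumptions: the table `TAG_TABLE`, the function `decode_reads`, the template names and all tag masses are hypothetical, and fragment length is treated as an exact integer.

```python
from collections import defaultdict

# Hypothetical assignment: each tag mass (amu) identifies one set, i.e. one
# template and one terminating dideoxynucleotide. Values are illustrative only.
TAG_TABLE = {
    300: ("template_1", "A"),
    303: ("template_1", "C"),
    306: ("template_1", "G"),
    309: ("template_1", "T"),
    312: ("template_2", "A"),
    315: ("template_2", "C"),
    318: ("template_2", "G"),
    321: ("template_2", "T"),
}


def decode_reads(observations):
    """Rebuild one sequence per template from (fragment_length, tag_mass) pairs.

    Observations may arrive in any order; sorting by fragment length stands in
    for the size separation step.
    """
    calls = defaultdict(list)
    for length, mass in sorted(observations):
        template, base = TAG_TABLE[mass]
        calls[template].append(base)
    return {template: "".join(bases) for template, bases in calls.items()}


observations = [(2, 303), (1, 300), (1, 318), (3, 309), (2, 318), (3, 312)]
print(decode_reads(observations))
```

The sequences recovered this way are those of the synthesized fragments, which are complementary to the templates being sequenced.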

[0167] In a preferred composition comprising a first plurality of sets as described above, there is additionally
present a second plurality of sets of compounds having the formula

Tms-L-MOI

wherein Tms is an organic group detectable by mass spectrometry, comprising carbon, at least one of hydrogen and
fluoride, and optional atoms selected from oxygen, nitrogen, sulfur, phosphorus and iodine. L is an organic group which
allows a Tms-containing moiety to be cleaved from the remainder of the compound, wherein the Tms-containing moiety
comprises a functional group which supports a single ionized charge state when the compound is subjected to mass
spectrometry and is selected from tertiary amine, quaternary amine and organic acid. MOI is a nucleic acid fragment
wherein L is conjugated to MOI at the MOI's 5' end. All members within the second plurality have an MOI sequence
which terminates with the same dideoxynucleotide selected from ddAMP, ddGMP, ddCMP and ddTMP, with the proviso
that the dideoxynucleotide present in the compounds of the first plurality is not the same dideoxynucleotide present in
the compounds of the second plurality.

[0168] The invention also provides a kit for DNA sequencing analysis. The kit comprises a plurality of container
sets, where each container set includes at least five containers. The first container contains a vector. The second, third,
fourth and fifth containers contain compounds of the formula:

Tms-L-MOI

wherein Tms is an organic group detectable by mass spectrometry, comprising carbon, at least one of hydrogen and
fluoride, and optional atoms selected from oxygen, nitrogen, sulfur, phosphorus and iodine. L is an organic group which
allows a Tms-containing moiety to be cleaved from the remainder of the compound, wherein the Tms-containing moiety
comprises a functional group which supports a single ionized charge state when the compound is subjected to mass
spectrometry and is selected from tertiary amine, quaternary amine and organic acid. MOI is a nucleic acid fragment
wherein L is conjugated to MOI at the MOI's 5' end. The MOI for the second, third, fourth and fifth containers is identical
and complementary to a portion of the vector within the set of containers, and the Tms group within each container is
different from the other Tms groups in the kit.

[0169] Preferably, within the kit, the plurality is at least 3, i.e., there are at least three sets of containers. More
preferably, there are at least 5 sets of containers.

[0170] As noted above, the present invention provides compositions and methods for determining the sequence of
nucleic acid molecules. Briefly, such methods generally comprise the steps of (a) generating tagged nucleic acid
fragments which are complementary to a selected nucleic acid molecule (e.g., tagged fragments from a first terminus to a
second terminus of a nucleic acid molecule), wherein a tag is correlative with a particular or selected nucleotide and
may be detected by any of a variety of methods, (b) separating the tagged fragments by sequential length, (c) cleaving
a tag from a tagged fragment, and (d) detecting the tags, and thereby determining the sequence of the nucleic acid
molecule. Each of the aspects will be discussed in more detail below.

B. SEQUENCING METHODS AND STRATEGIES

[0171] As noted above, the present invention provides methods for determining the sequence of a nucleic acid
molecule. Briefly, tagged nucleic acid fragments are prepared. The nucleic acid fragments are complementary to a selected
target nucleic acid molecule. In a preferred embodiment, the nucleic acid fragments are produced from a first terminus
to a second terminus of a nucleic acid molecule, and more preferably from a 5' terminus to a 3' terminus. In other
preferred embodiments, the tagged fragments are generated from 5'-tagged oligonucleotide primers or tagged
dideoxynucleotide terminators. A tag of a tagged nucleic acid fragment is correlative with a particular nucleotide and is
detectable by spectrometry (including fluorescence, but preferably other than fluorescence), or by potentiometry. In a
preferred embodiment, at least five tagged nucleic acid fragments are generated and each tag is unique for a nucleic acid
fragment. More specifically, the number of tagged fragments will generally range from about 5 to 2,000. The tagged nucleic
acid fragments may be generated from a variety of compounds, including those set forth above. It will be evident to one






in the art that the methods of the present invention are not limited to use only of the representative compounds and
compositions described herein.

[0172] Following generation of tagged nucleic acid fragments, the tagged fragments are separated by sequential
length. Such separation may be performed by a variety of techniques. In a preferred embodiment, separation is by liquid
chromatography (LC) and particularly preferred is HPLC. Next, the tag is cleaved from the tagged fragment. The
particular method for breaking a bond to release the tag is selected based upon the particular type of susceptibility of the
bond to cleavage. For example, a light-sensitive bond (i.e., one that breaks by light) will be exposed to light. The
released tag is detected by spectrometry or potentiometry. Preferred detection means are mass spectrometry, infrared
spectrometry, ultraviolet spectrometry and potentiostatic amperometry (e.g., with an amperometric detector or
coulometric detector).

[0173] It will be appreciated by one in the art that one or more of the steps may be automated, e.g., by use of an
instrument. In addition, the separation, cleavage and detection steps may be performed in a continuous manner (e.g.,
continuous flow/continuous fluid path of tagged fragments through separation to cleavage to tag detection). For
example, the various steps may be incorporated into a system, such that the steps are performed in a continuous manner.
Such a system is typically in an instrument or combination of instruments format. For example, tagged nucleic acid
fragments that are separated (e.g., by HPLC) may flow into a device for cleavage (e.g., a photo-reactor) and then into a tag
detector (e.g., a mass spectrometer or coulometric or amperometric detector). Preferably, the device for cleavage is
tunable so that an optimum wavelength for the cleavage reaction can be selected.

[0174] It will be apparent to one in the art that the methods of the present invention for nucleic acid sequencing may
be performed for a variety of purposes. For example, such uses of the present methods include primary sequence
determination for viral, bacterial, prokaryotic and eukaryotic (e.g., mammalian) nucleic acid molecules; mutation detection;
diagnostics; forensics; identity; and polymorphism detection.

1. Sequencing Methods 

[0175] As noted above, compounds including those of the present invention may be utilized for a variety of
sequencing methods, including both enzymatic and chemical degradation methods. Briefly, the enzymatic method
described by Sanger (Proc. Natl. Acad. Sci. (USA) 74:5463, 1977), which utilizes dideoxy-terminators, involves the
synthesis of a DNA strand from a single-stranded template by a DNA polymerase. The Sanger method of sequencing
depends on the fact that dideoxynucleotides (ddNTPs) are incorporated into the growing strand in the same way as
normal deoxynucleotides (albeit at a lower efficiency). However, ddNTPs differ from normal deoxynucleotides (dNTPs)
in that they lack the 3'-OH group necessary for chain elongation. When a ddNTP is incorporated into the DNA chain,
the absence of the 3'-hydroxy group prevents the formation of a new phosphodiester bond and the DNA fragment is
terminated with the ddNTP complementary to the base in the template DNA. The Maxam and Gilbert method (Maxam and
Gilbert, Proc. Natl. Acad. Sci. (USA) 74:560, 1977) employs a chemical degradation method of the original DNA (in
both cases the DNA must be clonal). Both methods produce populations of fragments that begin from a particular point
and terminate in every base that is found in the DNA fragment that is to be sequenced. The termination of each
fragment is dependent on the location of a particular base within the original DNA fragment. The DNA fragments are
separated by polyacrylamide gel electrophoresis and the order of the DNA bases (A, C, T, G) is read from an autoradiograph
of the gel.
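The logic of dideoxy termination and read-out can be illustrated with a short simulation. The Python sketch below is an idealization, not the laboratory procedure: it assumes that every possible termination position is represented in the fragment population, ignores the primer, and uses hypothetical function names (`synthesized_strand`, `terminated_fragments`, `read_sequence`) and a made-up template.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template):
    """Strand a polymerase would build, written 5'->3', from a 3'->5' template."""
    return "".join(COMPLEMENT[base] for base in template)


def terminated_fragments(template):
    """One reaction per ddNTP: the fragment lengths at which that ddNTP ends the chain."""
    strand = synthesized_strand(template)
    reactions = {base: [] for base in "ACGT"}
    for position, base in enumerate(strand, start=1):
        reactions[base].append(position)
    return reactions


def read_sequence(reactions):
    """Order all fragments by length and read off the terminating base of each."""
    by_length = sorted(
        (length, base) for base, lengths in reactions.items() for length in lengths
    )
    return "".join(base for _, base in by_length)


template = "TACGGT"  # hypothetical template, written 3'->5'
reactions = terminated_fragments(template)
print(reactions)
print(read_sequence(reactions))
```

Sorting by length plays the role of the gel: reading the terminating base of each successively longer fragment reproduces the synthesized strand, which is the complement of the template.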

2. Exonuclease DNA Sequencing 

[0176] A procedure for determining DNA nucleotide sequences was reported by Labeit et al. (S. Labeit, H. Lehrach
& R. S. Goody, DNA 5:173-7, 1986; A new method of DNA sequencing using deoxynucleoside alpha-thiotriphosphates).
In the first step of the method, four DNAs, each separately substituted with a different deoxynucleoside
phosphorothioate in place of the corresponding monophosphate, are prepared by template-directed polymerization
catalyzed by DNA polymerase. In the second step, these DNAs are subjected to stringent exonuclease III treatment,
which produces only fragments terminating with a phosphorothioate internucleotide linkage. These can then be
separated by standard gel electrophoresis techniques and the sequence can be read directly as in presently used
sequencing methods. Porter et al. (K. W. Porter, J. Tomasz, F. Huang, A. Sood & B. R. Shaw, Biochemistry 34:11963-11969,
1995; N7-cyanoborane-2'-deoxyguanosine 5'-triphosphate is a good substrate for DNA polymerase) described a new
set of boron-substituted nucleotide analogs which are also exonuclease resistant and good substrates for a number of
polymerases; these bases are also suitable for exonuclease DNA sequencing.

3. A Simplified Strategy for Sequencing Large Numbers of Full Length cDNAs. 

[0177] cDNA sequencing has been suggested as an alternative to generating the complete human genomic 




sequence. Two approaches have been attempted. The first involves generation of expressed sequence tags (ESTs)
through a single DNA sequence pass at one end of each cDNA clone. This method has given insights into the
distribution of types of expressed sequences and has revealed occasional useful homology with genomic fragments, but
overall has added little to our knowledge base since insufficient data from each clone is provided. The second approach
is to generate complete cDNA sequence, which can indicate the possible function of the cDNAs. Unfortunately, most
cDNAs are of a size range of 1-4 kilobases, which hinders the automation of full-length sequence determination.
Currently the most efficient method for large scale, high throughput sequence production is sequencing from a
vector/primer site, which typically yields less than 500 bases of sequence from each flank. The synthesis of new
oligonucleotide primers of length 15-18 bases for "primer walking" can allow closure of each sequence. An alternative
strategy for full length cDNA sequencing is to generate modified templates that are suitable for sequencing with a
universal primer, but provide overlapping coverage of the molecules.

[0178] Shotgun sequencing methods can be applied to cDNA sequencing studies by preparing a separate library 
from each cDNA clone. These methods have not been used extensively for the analysis of the 1.5 - 4.0 kilobase frag-
ments, however, as they are very labor intensive during the initial cloning phase. Instead they have generally been
applied to projects where the target sequence is of the order of 15 to 40 kilobases, such as in lambda or cosmid inserts.

4. Analogy of cDNA with Genomic Sequencing

[0179] Despite the typically different size of the individual clones to be analyzed in cDNA sequencing, there are sim-
ilarities with the requirements for large scale genomic DNA sequencing. In addition to a low cost per base, and a high
throughput, the ideal strategy for full length cDNA sequencing will have a high accuracy. The favored current method- 
ology for genomic DNA sequencing involves the preparation of shotgun sequencing libraries from cosmids, followed by 
random sequencing using ABI fluorescent DNA sequencing instruments, and closure (finishing) by directed efforts. 
Overall there is agreement that the fluorescent shotgun approach is superior to current alternatives in terms of effi- 
ciency and accuracy. The initial shotgun library quality is a critical determinant of the ease and quality of sequence 
assembly. The high quality of the available shotgun library procedure has prompted a strategy for the production of mul- 
tiplex shotgun libraries containing mixtures of the smaller cDNA clones. Here the individual clones to be sequenced are 
mixed prior to library construction and then identified following random sequencing, at the stage of computer analysis. 
Junctions between individual clones are labeled during library production either by PCR or by identification of vector 
arm sequence. 

[0180] Clones may be prepared either by microbial methods or by PCR. When using PCR, three reactions from each
clone are used in order to minimize the risk of errors.

[0181] One pass sequencing is a new technique designed to speed the identification of important sequences within
a new region of genomic DNA. Briefly, a high quality shotgun library is prepared and then the sequences sampled to 
obtain 80 - 95% coverage. For a cosmid this would typically be about 200 samples. Essentially all genes are likely to 
have at least one exon detected in this sample using either sequence similarity (BLAST) or exon structure (GRAIL2) 
screening. 
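
The relationship between the number of sampled sequences and the fraction of a cosmid that is covered can be sketched with a simple Poisson sampling model. The model itself, the 40 kilobase insert size and the 400 base read length are assumptions made for illustration only; they are not values stated in this specification.

```python
import math


def expected_coverage(num_reads, read_length, target_length):
    """Expected fraction of the target covered by at least one read.

    Assumes reads start at uniformly random positions (Poisson model);
    this model is an illustrative assumption, not part of the method.
    """
    redundancy = num_reads * read_length / target_length
    return 1.0 - math.exp(-redundancy)


def reads_for_coverage(fraction, read_length, target_length):
    """Smallest number of reads whose expected coverage reaches `fraction`."""
    redundancy = -math.log(1.0 - fraction)
    return math.ceil(redundancy * target_length / read_length)


if __name__ == "__main__":
    cosmid_bp = 40_000  # assumed cosmid insert size
    read_bp = 400       # assumed usable read length
    for n in (100, 200, 300):
        print(n, "reads:", round(expected_coverage(n, read_bp, cosmid_bp), 3))
```

Under these assumed values, 200 samples correspond to an expected coverage inside the 80 - 95% range given above.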

[0182] "Skimming" has been successfully applied to cosmids and P1s. One pass sequencing is potentially the fast-
est and least expensive way to find genes in a positional cloning project. The outcome is virtually assured. Most inves-
tigators are currently developing cosmid contigs for exon trapping and related techniques. Cosmids are completely
suitable for sequence skimming. P1 and other BACs could be considerably cheaper since there are savings both in shot-
gun library construction and minimization of overlaps.

5. Shotgun Sequencing

[0183] Shotgun DNA sequencing starts with random fragmentation of the target DNA. Random sequencing is then 
used to generate the majority of the data. A directed phase then completes gaps, ensuring coverage of each strand in 
both directions. Shotgun sequencing offers the advantage of high accuracy at relatively low cost. The procedure is best
suited to the analysis of relatively large fragments and is the method of choice in large scale genomic DNA sequencing. 
[0184] There are several factors that are important in making shotgun sequencing accurate and cost effective. A 
major consideration is the quality of the shotgun library that is generated, since any clones that do not have inserts, or 
have chimeric inserts, will result in subsequent inefficient sequencing. Another consideration is the careful balancing of 
the random and the directed phases of the sequencing, so that high accuracy is obtained with a minimal loss of effi- 
ciency through unnecessary sequencing. 

6. Sequencing Chemistry: Tagged-Terminator Chemistry 

[0185] There are two types of fluorescent sequencing chemistries currently available: dye primer, where the primer
is fluorescently labeled, and dye terminator, where the dideoxy terminators are labeled. Each of these chemistries can
be used with either Taq DNA polymerase or sequenase enzymes. Sequenase enzyme seems to read easily through G-
C rich regions, palindromes, simple repeats and other difficult to read sequences. Sequenase is also good for sequenc-
ing mixed populations. Sequenase sequencing requires 5 µg of template, one extension and a multi-step cleanup proc-
ess. Tagged-primer sequencing requires four separate reactions, one for each of A, C, G and T, and then a laborious
cleanup protocol. Taq terminator cycle sequencing chemistry is the most robust sequencing method. With this method
any sequencing primer can be used. The amount of template needed is relatively small and the whole reaction process
from setup to cleanup is reasonably easy, compared to sequenase and dye primer chemistries. Only 1.5 µg of DNA
template and 4 pm of primer are needed. To this a ready reaction mix is added. This mix consists of buffer, enzyme,
dNTPs and labeled dideoxynucleotides. This reaction can be done in one tube as each of the four dideoxies is labeled
with a different fluorescent dye. These labeled terminators are present in this mix in excess because they are difficult
to incorporate during extension. With unclean DNA the incorporation of these high molecular weight dideoxies can be
inhibited. The premix includes dITP to minimize band compression. The use of Taq as the DNA polymerase allows the
reactions to be run at high temperatures to minimize secondary structure problems as well as non-specific primer bind-
ing. The whole cocktail goes through 25 cycles of denaturation, annealing and extension in a thermal cycler and the
completed reaction is spun through a Sephadex G50 (Pharmacia, Piscataway, NJ) column and is ready for gel loading
after five minutes in a vacuum desiccator.

7. Designing Primers 


[0186] When designing primers, the same criteria should be used as for designing PCR primers. In particular, prim-
ers should preferably be 18 to 20 nucleotides long and the 3-prime end base should be a G or a C. Primers should also
preferably have a Tm of more than 50°C. Primers shorter than 18 nucleotides will work but are not recommended. The
shorter the primer the greater the probability of it binding at more than one site on the template DNA, and the lower its
Tm. The sequence should have 100% match with the template. Any mismatch, especially towards the 3-prime end, will
greatly diminish sequencing ability. However, primers with 5-prime tails can be used as long as there are about 18 bases
at 3-prime that bind. If one is designing a primer from a sequence chromatogram, an area with high confidence must
be used. As one moves out past 350 to 400 bases on a standard chromatogram, the peaks get broader and the base
calls are not as accurate. As described herein, the primer may possess a 5' handle through which a linker or linker tag
may be attached.
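
The primer criteria above can be checked mechanically. The following is a minimal sketch only: the function names are hypothetical, and the Tm estimate uses the Wallace rule (2°C per A or T, 4°C per G or C), which is an assumption chosen for simplicity and is not a formula given in this specification.

```python
def wallace_tm(primer):
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C.

    The Wallace rule is an assumed approximation, not taken from the text.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def check_primer(primer, min_len=18, max_len=20, min_tm=50):
    """Return a list of criteria from the text that the primer fails."""
    primer = primer.upper()
    problems = []
    if set(primer) - set("ACGT"):
        problems.append("contains characters other than A, C, G, T")
    if not min_len <= len(primer) <= max_len:
        problems.append("length is outside 18 to 20 nucleotides")
    if primer[-1:] not in ("G", "C"):
        problems.append("3-prime end base is not G or C")
    if wallace_tm(primer) <= min_tm:
        problems.append("estimated Tm is not above 50 degrees C")
    return problems


if __name__ == "__main__":
    # Both sequences are made up for illustration.
    for seq in ("ATGCGTACCTGATCGGAC", "ATATATATATATATAT"):
        print(seq, wallace_tm(seq), check_primer(seq) or "passes")
```

An empty list means the primer meets the length, 3-prime base and Tm preferences; the 100% template match requirement would need the template sequence and is not checked here.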

8. Nucleic Acid Template Preparation 

[0187] The most important factor in tagged-primer DNA sequencing is the quality of the template. Briefly, one com-
mon misconception is that if a template works in manual sequencing, it should work in automated sequencing. In fact,
if a reaction works in manual sequencing it may work in automated sequencing; however, automated sequencing is
much more sensitive and a poor quality template may result in little or no data when fluorescent sequencing methods
are utilized. High salt concentrations and other cell material not properly extracted during template preparation, includ-
ing RNA, may likewise prevent the ability to obtain accurate sequence information. Many mini and maxi prep protocols
produce DNA which is good enough for manual sequencing or PCR, but not for automated (tagged-primer) sequencing.
Also the use of phenol is not at all recommended as phenol can intercalate in the helix structure. The use of 100% chlo-
roform is sufficient. There are a number of DNA preparation methods which are particularly preferred for the tagged
primer sequencing methods provided herein. In particular, maxi preps which utilize cesium chloride preparations or Qia-
gen (Chatsworth, CA) maxi prep columns (being careful not to overload) are preferred. For mini preps, columns such
as Promega's Magic Mini prep (Madison, WI) may be utilized. When sequencing DNA fragments such as PCR frag-
ments or restriction cut fragments, it is generally preferred to cut the desired fragment from a low melt agarose gel and
then purify with a product such as GeneClean (La Jolla, CA). It is very important to make sure that only one band is cut
from the gel. For PCR fragments the PCR primers or internal primers can be used in order to ensure that the appropri-
ate fragment was sequenced. To get optimum performance from the sequence analysis software, fragments should be
larger than 200 bases. Double stranded or single stranded DNA can be sequenced by this method.

[0188] An additional factor generally taken into account when preparing DNA for sequencing is the choice of host
strain. Companies selling equipment and reagents for sequencing, such as ABI (Foster City, CA) and Qiagen (Chats-
worth, CA), typically recommend preferred host strains, and have previously recommended strains such as DH5 alpha,
HB101, XL-1 Blue, JM109, MV1190. Even when the DNA preparations are very clean, there are other inherent factors
which can make it difficult to obtain sequence. G-C rich templates are always difficult to sequence through, and second-
ary structure can also cause problems. Sequencing through long repeats often proves to be difficult. For instance, as
Taq moves along a poly T stretch, the enzyme often falls off the template and jumps back on again, skipping a T. This
results in extension products with X amount of Ts in the poly T stretch and fragments with X-1, X-2 etc. amounts of Ts
in the poly T stretch. The net effect is that more than one base appears in each position, making the sequence impos-
sible to read.

9. Use of Molecularly Distinct Cloning Vectors


[0189] Sequencing may also be accomplished utilizing a universal cloning vector (M13) and complementary
sequencing primers. Briefly, for present cloning vectors the same primer sequence is used and only 4 tags are
employed (each tag is a different fluorophore which represents a different terminator (ddNTP)), so every amplification
process must take place in a different container (one DNA sample per container). That is, it is impossible to mix two or
more different DNA samples in the same amplification process. With only 4 tags available, only one DNA sample can be run
per gel lane. There is no convenient means to deconvolute the sequence of more than one DNA sample with only 4
tags. (In this regard, workers in the field take great care not to mix or contaminate different DNA samples when using
current technologies.)
[0190] A substantial advantage is gained when multiples of 4 tags can be run per gel lane or respective separation
process. In particular, utilizing tags of the present invention, more than one DNA sample in a single amplification reac-
tion or container can be processed. When multiples of 4 tags are available for use, each tag set can be assigned to a
particular DNA sample that is to be amplified. (A tag set is composed of a series of 4 different tags, each with a unique
property. Each tag is assigned to represent a different dideoxy-terminator: ddATP, ddGTP, ddCTP, or ddTTP.) To employ
this advantage a series of vectors must be generated in which a unique priming site is inserted. A unique priming site
is simply a stretch of 18 nucleotides which differs from vector to vector. The remaining nucleotide sequence is con-
served from vector to vector. A sequencing primer is prepared (synthesized) which corresponds to each unique vector.
Each unique primer is derivatized (or labelled) with a unique tag set.

[0191] With these respective molecular biology tools in hand, it is possible in the present invention to process mul-
tiple samples in a single container. First, DNA samples which are to be sequenced are cloned into the multiplicity of vec-
tors. For example, if 100 unique vectors are available, 100 ligation reactions, plating steps, and picking of plaques are
performed. Second, one sample from each vector type is pooled, making a pool of 100 unique vectors containing 100
unique DNA fragments or samples. A given DNA sample is therefore identified and automatically assigned a primer set
with the associated tag set. The respective primers, buffers, polymerase(s), ddNTPs, dNTPs and co-factors are added
to the reaction container and the amplification process is carried out. The reaction is then subjected to a separation step
and the respective sequence is established from the temporal appearance of tags. The ability to pool multiple DNA
samples has substantial advantages. The reagent cost of a typical PCR reaction is about $2.00 per sample. With the
method described herein the cost of amplification on a per sample basis could be reduced at least by a factor of 100.
Sample handling could be reduced by a factor of at least 100, and materials costs could be reduced. The need for large
scale amplification robots would be obviated.






10. Sequencing Vectors for Cleavable Mass Spectroscopy Tagging



[0192] Using cleavable mass spectroscopy tagging (CMST) of the present invention, each individual sequencing
reaction can be read independently and simultaneously as the separation proceeds. In CMST sequencing a different
primer is used for each cloning vector: each reaction has 20 different primers when 20 clones are used per pool. Each
primer corresponds to one of the vectors, and each primer is tagged with a unique CMST molecule. Four reactions are
performed on each pooled DNA sample (one for each base), so every vector has four oligonucleotide primers, each one
identical in sequence but tagged with a different CMS tag. The four separate sequencing reactions are pooled and run
together. When 20 samples are pooled, 80 tags are used (4 bases per sample times 20 samples) and all 80 are
detected simultaneously as the gel is run.
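
The bookkeeping implied by this scheme, one tag per (sample, base) pair and reconstruction of each sample's sequence from the order in which its tags are detected, can be sketched as follows. The integer tag identifiers, the function names and the event list are hypothetical and chosen purely for illustration; real tags would be distinguished by their detected mass.

```python
from itertools import product

BASES = "ACGT"


def assign_tags(sample_ids):
    """Give every (sample, base) pair its own tag number, 4 tags per sample."""
    return {pair: tag for tag, pair in enumerate(product(sample_ids, BASES))}


def demultiplex(events, tag_table):
    """Rebuild one read per sample from tags listed in order of detection.

    `events` is assumed to be ordered by fragment length (shortest first);
    tags from different samples may interleave freely.
    """
    lookup = {tag: pair for pair, tag in tag_table.items()}
    reads = {}
    for tag in events:
        sample, base = lookup[tag]
        reads[sample] = reads.get(sample, "") + base
    return reads


if __name__ == "__main__":
    pool = assign_tags(range(20))
    print(len(pool), "tags for 20 pooled samples")

    table = assign_tags(["s1", "s2"])
    # Hypothetical detection order for two pooled samples.
    print(demultiplex([0, 6, 1, 7, 2, 4], table))
```

With 20 pooled samples the table holds 80 distinct tags, matching the count given above.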

[0193] The construction of the vectors may be accomplished by cloning a random 20-mer on either side of a restric-
tion site. The resulting clones are sequenced and a number chosen for use as vectors. Two oligonucleotides are pre-
pared for each vector chosen, one homologous to the sequence at each side of the restriction site, and each orientated
so that the 3'-end is towards the restriction site. Four tagged preparations of each primer are prepared, one for each
base in the sequencing reactions and each one labeled with a unique CMS tag.

11. Advantages of Sequencing by the Use of Reversible Tags

[0194] There are substantial advantages when cleavable tags are used in sequencing and related technologies.
First, an increase in sensitivity will contribute to longer read lengths, as will the ability to collect tags for a specified
period of time prior to measurement. The use of cleavable tags permits the development of a system that equalizes
bandwidth over the entire range of the gel (1-1500 nucleotides (nt), for example). This will greatly impact the ability to
obtain read lengths greater than 450 nt.






[0195] The use of cleavable multiple tags (MW identifiers) also has the advantage that multiple DNA samples can
be run on a single gel lane or separation process. For example, it is possible using the methodologies disclosed herein
to combine at least 96 samples and 4 sequencing reactions (A, G, T, C) on a single lane or fragment sizing process. If
multiple vectors are employed which possess unique priming sites, then at least 384 samples can be combined per gel
lane (the different terminator reactions cannot be amplified together with this scheme). When the ability to employ
cleavable tags is combined with the ability to use multiple vectors, an apparent 10,000-fold increase in DNA sequencing
throughput is achieved. Also, in the schemes described herein, reagent use is decreased, disposables decrease, with a
resultant decrease in operating costs to the consumer.

[0196] An additional advantage is gained from the ability to process internal controls throughout the entire method-
ologies described here. For any set of samples, an internal control nucleic acid can be placed in the sample(s). This is
not possible with the current configurations. This advantage permits the control of the amplification process, the sepa- 
ration process, the tag detection system and sequence assembly. This is an immense advantage over current systems 
in which the controls are always separated from the samples in all steps. 

[0197] The compositions and methods described herein also have the advantage that they are modular in nature 
and can be fitted on any type of separation process or method and in addition, can be fitted onto any type of detection 
system as improvements are made in either type of respective technology. For example, the methodologies
described herein can be coupled with "bundled" CE arrays or microfabricated devices that enable separation of DNA 
fragments. 

C. SEPARATION OF DNA FRAGMENTS 

[0198] A sample that requires analysis is often a mixture of many components in a complex matrix. For samples
containing unknown compounds, the components must be separated from each other so that each individual compo- 
nent can be identified by other analytical methods. The separation properties of the components in a mixture are con- 
stant under constant conditions, and therefore once determined they can be used to identify and quantify each of the 
components. Such procedures are typical in chromatographic and electrophoretic analytical separations. 

1. High-Performance Liquid Chromatography (HPLC)

[0199] High-Performance liquid chromatography (HPLC) is a chromatographic separations technique to separate 
compounds that are dissolved in solution. HPLC instruments consist of a reservoir of mobile phase, a pump, an injector, 
a separation column, and a detector. Compounds are separated by injecting an aliquot of the sample mixture onto the 
column. The different components in the mixture pass through the column at different rates due to differences in their 
partitioning behavior between the mobile liquid phase and the stationary phase. 

[0200] Recently, IP-RP-HPLC on non-porous PS/DVB particles with chemically bonded alkyl chains has been
shown to be a rapid alternative to capillary electrophoresis in the analysis of both single and double-strand nucleic acids,
providing similar degrees of resolution (Huber et al., 1993, Anal. Biochem., 212, p351; Huber et al., 1993, Nuc. Acids
Res., 21, p1061; Huber et al., 1993, Biotechniques, 16, p898). In contrast to ion-exchange chromatography, which
does not always retain double-strand DNA as a function of strand length (since AT base pairs interact with the posi-
tively charged stationary phase more strongly than GC base pairs), IP-RP-HPLC enables a strictly size-dependent
separation.

[0201] A method has been developed in which, using 100 mM triethylammonium acetate as ion-pairing reagent, phosphodi-
ester oligonucleotides could be successfully separated on alkylated non-porous 2.3 µm poly(styrene-divinylbenzene)
particles by means of high performance liquid chromatography (Oefner et al., 1994, Anal. Biochem., 223, p39). The
technique described allowed the separation of PCR products differing by only 4 to 8 base pairs in length within a size range
of 50 to 200 nucleotides.

2. Electrophoresis 

[0202] Electrophoresis is a separations technique that is based on the mobility of ions (or DNA as is the case
described herein) in an electric field. Negatively charged DNA migrates toward a positive electrode and posi-
tively-charged ions migrate toward a negative electrode. For safety reasons one electrode is usually at ground and the
other is biased positively or negatively. Charged species have different migration rates depending on their total charge,
size, and shape, and can therefore be separated. An electrophoresis apparatus consists of a high-voltage power supply, elec-
trodes, buffer, and a support for the buffer such as a polyacrylamide gel or a capillary tube. Open capillary tubes are
used for many types of samples and the other gel supports are usually used for biological samples such as protein mix-
tures or DNA fragments.




3. Capillary Electrophoresis (CE)

[0203] Capillary electrophoresis (CE) in its various manifestations (free solution, isotachophoresis, isoelectric
focusing, polyacrylamide gel, micellar electrokinetic "chromatography") is developing as a method for rapid high reso-
lution separations of very small sample volumes of complex mixtures. In combination with the inherent sensitivity and
selectivity of MS, CE-MS is a potentially powerful technique for bioanalysis. In the novel application disclosed herein, the
interfacing of these two methods will lead to superior DNA sequencing methods that eclipse the rate of current methods
of sequencing by several orders of magnitude.

[0204] The correspondence between CE and electrospray ionization (ESI) flow rates and the fact that both are facil-
itated by (and primarily used for) ionic species in solution provide the basis for an extremely attractive combination. The
combination of both capillary zone electrophoresis (CZE) and capillary isotachophoresis with quadrupole mass spec-
trometers based upon ESI has been described (Olivares et al., Anal. Chem. 59:1230, 1987; Smith et al., Anal. Chem.
60:436, 1988; Loo et al., Anal. Chem. 179:404, 1989; Edmonds et al., J. Chroma. 1989; Loo et al., J. Microcol-
umn Sep. 1:223, 1989; Lee et al., J. Chromatog. 458:313, 1988; Smith et al., J. Chromatog. 480:211, 1989; Grese et
al., J. Am. Chem. Soc. 111:2835, 1989). Small peptides are easily amenable to CZE analysis with good (femtomole)
sensitivity.

[0205] The most powerful separation method for DNA fragments is polyacrylamide gel electrophoresis (PAGE),
generally in a slab gel format. However, the major limitation of the current technology is the relatively long time required
to perform the gel electrophoresis of DNA fragments produced in the sequencing reactions. An order of magnitude increase (10-
fold) can be achieved with the use of capillary electrophoresis, which utilizes ultrathin gels. In free solution, to a first approx-
imation, all DNA migrate with the same mobility, as the addition of a base results in the compensation of mass and
charge. In polyacrylamide gels, DNA fragments sieve and migrate as a function of length, and this approach has now
been applied to CE. Remarkable plate numbers per meter have now been achieved with cross-linked polyacrylamide
(10^7 plates per meter, Cohen et al., Proc. Natl. Acad. Sci., USA 85:9660, 1988). Such CE columns as described can
be employed for DNA sequencing. The method of CE is in principle 25 times faster than slab gel electrophoresis in a
standard sequencer. For example, about 300 bases can be read per hour. The separation speed is limited in slab gel
electrophoresis by the magnitude of the electric field which can be applied to the gel without excessive heat production.
Therefore, the greater speed of CE is achieved through the use of higher field strengths (300 V/cm in CE versus 10
V/cm in slab gel electrophoresis). The capillary format reduces the amperage and thus power and the resultant heat
generation.

[0206] Smith and others (Smith et al., Nuc. Acids. Res. 18:4417, 1990) have suggested employing multiple capil-
laries in parallel to increase throughput. Likewise, Mathies and Huang (Mathies and Huang, Nature 359:167, 1992)
have introduced capillary electrophoresis in which separations are performed on a parallel array of capillaries and dem-
onstrated high through-put sequencing (Huang et al., Anal. Chem. 64:967, 1992; Huang et al., Anal. Chem. 64:2149,
1992). The major disadvantage of capillary electrophoresis is the limited amount of sample that can be loaded onto the
capillary. By concentrating a large amount of sample at the beginning of the capillary, prior to separation, loadability is
increased, and detection levels can be lowered several orders of magnitude. The most popular method of preconcen-
tration in CE is sample stacking. Sample stacking has recently been reviewed (Chien and Burgi, Anal. Chem. 64:489A,
1992). Sample stacking depends on the matrix difference (pH, ionic strength) between the sample buffer and the cap-
illary buffer, so that the electric field across the sample zone is greater than in the capillary region. In sample stacking a
large volume of sample in a low concentration buffer is introduced for preconcentration at the head of the capillary col-
umn. The capillary is filled with a buffer of the same composition, but at higher concentration. When the sample ions
reach the capillary buffer and the lower electric field, they stack into a concentrated zone. Sample stacking has
increased detectabilities 1-3 orders of magnitude.
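
The magnitude of the stacking effect can be illustrated with an idealized calculation. The sketch below assumes that buffer conductivity is proportional to buffer concentration and that the same current flows through the sample zone and the capillary buffer, so that the ratio of electric fields equals the ratio of buffer concentrations; diffusion and electroosmotic mixing are ignored. These assumptions, the function names and the numerical values are illustrative and are not taken from this specification.

```python
def field_enhancement(buffer_conc_mM, sample_conc_mM):
    """Ratio of the field in the sample zone to the field in the capillary buffer.

    Assumes conductivity scales linearly with buffer concentration and that
    both zones carry the same current (idealized model).
    """
    return buffer_conc_mM / sample_conc_mM


def stacked_zone_length(plug_length_mm, buffer_conc_mM, sample_conc_mM):
    """Ideal length of the sample zone after stacking, ignoring dispersion."""
    return plug_length_mm / field_enhancement(buffer_conc_mM, sample_conc_mM)


if __name__ == "__main__":
    # Hypothetical example: 10 mm plug in 1 mM buffer, capillary in 100 mM buffer.
    gamma = field_enhancement(100.0, 1.0)
    print("field enhancement:", gamma)
    print("stacked zone (mm):", stacked_zone_length(10.0, 100.0, 1.0))
```

In this idealized picture a 10- to 1000-fold difference in buffer concentration corresponds to the 1-3 orders of magnitude stated above; practical gains would be expected to be lower.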

[0207] Another method of preconcentration is to apply isotachophoresis (ITP) prior to the free zone CE separation
of analytes. ITP is an electrophoretic technique which allows microliter volumes of sample to be loaded on to the capil-
lary, in contrast to the low nL injection volumes typically associated with CE. The technique relies on inserting the sam-
ple between two buffers (leading and trailing electrolytes) of higher and lower mobility, respectively, than the analyte.
The technique is inherently a concentration technique, where the analytes concentrate into pure zones migrating with
the same speed. The technique is currently less popular than the stacking methods described above because of the
need for several choices of leading and trailing electrolytes, and the ability to separate only cationic or anionic species
during a separation process.

[0208] The heart of the DNA sequencing process is the remarkably selective electrophoretic separation of DNA or
oligonucleotide fragments. It is remarkable because each fragment is resolved and differs by only one nucleotide. Separa-
tions of up to 1000 fragments (1000 bp) have been obtained. A further advantage of sequencing with cleavable tags is
as follows. There is no requirement to use a slab gel format when DNA fragments are separated by polyacrylamide gel
electrophoresis when cleavable tags are employed. Since numerous samples are combined (4 to 2000) there is no
need to run samples in parallel as is the case with current dye-primer or dye-terminator methods (i.e., ABI373
sequencer). Since there is no reason to run parallel lanes, there is no reason to use a slab gel. Therefore, one can
employ a tube gel format for the electrophoretic separation method. Grossman et al. (Genet. Anal. Tech.
Appl. 9:9, 1992) have shown that considerable advantage is gained when a tube gel format is used in place of a slab
gel format. This is due to the greater ability to dissipate Joule heat in a tube format compared to a slab gel, which results
in faster run times (by 50%), and much higher resolution of high molecular weight DNA fragments (greater than 1000
nt). Long reads are critical in genomic sequencing. Therefore, the use of cleavable tags in sequencing has the addi-
tional advantage of allowing the user to employ the most efficient and sensitive DNA separation method which also pos-
sesses the highest resolution.

4. Microfabricated Devices

[0209] Capillary electrophoresis (CE) is a powerful method for DNA sequencing, forensic analysis, PCR product
analysis and restriction fragment sizing. CE is far faster than traditional slab PAGE since with capillary gels a far higher
potential field can be applied. However, CE has the drawback of allowing only one sample to be processed per gel. Micro-
fabricated devices combine the faster separation times of CE with the ability to analyze multiple samples in parallel. The under-
lying concept behind the use of microfabricated devices is the ability to increase the information density in electrophore-
sis by miniaturizing the lane dimension to about 100 micrometers. The electronics industry routinely uses
microfabrication to make circuits with features of less than one micron in size. The current density of capillary arrays is
limited by the outside diameter of the capillary tube. Microfabrication of channels produces a higher density of arrays.
Microfabrication also permits physical assemblies not possible with glass fibers and links the channels directly to other
devices on a chip. Few devices have been constructed on microchips for separation technologies. A gas chromatograph
(Terry et al., IEEE Trans. Electron Device, ED-26:1880, 1979) and a liquid chromatograph (Manz et al., Sens. Actuators
B1:249, 1990) have been fabricated on silicon chips, but these devices have not been widely used. Several groups have
reported separating fluorescent dyes and amino acids on microfabricated devices (Manz et al., J. Chromatography
593:253, 1992; Effenhauser et al., Anal. Chem. 65:2637, 1993). Recently Woolley and Mathies (Woolley and Mathies,
Proc. Natl. Acad. Sci. 91:11348, 1994) have shown that photolithography and chemical etching can be used to make
large numbers of separation channels on glass substrates. The channels are filled with hydroxyethyl cellulose (HEC)
separation matrices. It was shown that DNA restriction fragments could be separated in as little as two minutes.

D. CLEAVAGE OF TAGS 

[0210] As described above, different linker designs will confer cleavability ("lability") under different specific physical
or chemical conditions. Examples of conditions which serve to cleave various designs of linker include acid, base, oxi-
dation, reduction, fluoride, thiol exchange, photolysis, and enzymatic conditions.

[0211] Examples of cleavable linkers that satisfy the general criteria for linkers listed above will be well known to 
those in the art and include those found in the catalog available from Pierce (Rockford, IL). Examples include: 

• ethylene glycolbis(succinimidylsuccinate) (EGS), an amine reactive cross-linking reagent which is cleavable by
hydroxylamine (1 M at 37°C for 3-6 hours);

• disuccinimidyl tartarate (DST) and sulfo-DST, which are amine reactive cross-linking reagents, cleavable by 0.015
M sodium periodate;

• bis[2-(succinimidyloxycarbonyloxy)ethyl]sulfone (BSOCOES) and sulfo-BSOCOES, which are amine reactive
cross-linking reagents, cleavable by base (pH 11.6);

• 1,4-di-[3'-(2'-pyridyldithio)propionamido]butane (DPDPB), a pyridyldithiol crosslinker which is cleavable by thiol
exchange or reduction;

• N-[4-(p-azidosalicylamido)butyl]-3'-(2'-pyridyldithio)propionamide (APDP), a pyridyldithiol crosslinker which is
cleavable by thiol exchange or reduction;

• bis-[beta-(4-azidosalicylamido)ethyl]-disulfide, a photoreactive crosslinker which is cleavable by thiol exchange or
reduction;

• N-succinimidyl-(4-azidophenyl)-1,3'-dithiopropionate (SADP), a photoreactive crosslinker which is cleavable by thiol
exchange or reduction;

• sulfosuccinimidyl-2-(7-azido-4-methylcoumarin-3-acetamide)ethyl-1,3'-dithiopropionate (SAED), a photoreactive
crosslinker which is cleavable by thiol exchange or reduction;

• sulfosuccinimidyl-2-(m-azido-o-nitrobenzamido)-ethyl-1,3'-dithiopropionate (SAND), a photoreactive crosslinker
which is cleavable by thiol exchange or reduction.
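
The reagents listed above can be summarized as a small lookup table keyed by the abbreviations used in the list. The following Python sketch is illustrative only: the names CLEAVAGE_CONDITIONS and linkers_cleaved_by are hypothetical, and the entries simply restate the cleavage conditions given in the list. The disulfide reagent that the list gives without an abbreviation is omitted.

```python
from typing import Dict, List

# Illustrative lookup of the cleavage conditions stated in the list above.
# The names CLEAVAGE_CONDITIONS and linkers_cleaved_by are hypothetical.
CLEAVAGE_CONDITIONS: Dict[str, str] = {
    "EGS": "hydroxylamine",
    "DST": "sodium periodate",
    "sulfo-DST": "sodium periodate",
    "BSOCOES": "base",
    "sulfo-BSOCOES": "base",
    "DPDPB": "thiol exchange or reduction",
    "APDP": "thiol exchange or reduction",
    "SADP": "thiol exchange or reduction",
    "SAED": "thiol exchange or reduction",
    "SAND": "thiol exchange or reduction",
}


def linkers_cleaved_by(condition: str) -> List[str]:
    """Return the abbreviations of all listed linkers cleaved by `condition`."""
    return sorted(
        name for name, cond in CLEAVAGE_CONDITIONS.items() if cond == condition
    )


if __name__ == "__main__":
    for cond in sorted(set(CLEAVAGE_CONDITIONS.values())):
        print(cond, "->", ", ".join(linkers_cleaved_by(cond)))
```

Grouping by condition in this way makes it easy to see which linkers could share a single reagent-addition step.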

[0212] Other examples of cleavable linkers and the cleavage conditions that can be used to release tags are as follows.
A silyl linking group can be cleaved by fluoride or under acidic conditions. A 3-, 4-, 5-, or 6-substituted-2-nitrobenzyloxy
or 2-, 3-, 5-, or 6-substituted-4-nitrobenzyloxy linking group can be cleaved by a photon source (photolysis). A 3-,
4-, 5-, or 6-substituted-2-alkoxyphenoxy or 2-, 3-, 5-, or 6-substituted-4-alkoxyphenoxy linking group can be cleaved
by Ce(NH₄)₂(NO₃)₆ (oxidation). A NCO₂ (urethane) linker can be cleaved by hydroxide (base), acid, or LiAlH₄ (reduction).
A 3-pentenyl, 2-butenyl, or 1-butenyl linking group can be cleaved by O₃, OsO₄/IO₄⁻, or KMnO₄ (oxidation). A 2-
[3-, 4-, or 5-substituted-furyl]oxy linking group can be cleaved by O₂, Br₂, MeOH, or acid.

[0213] Conditions for the cleavage of other labile linking groups include: t-alkyloxy linking groups can be cleaved by
acid; methyl(dialkyl)methoxy or 4-substituted-2-alkyl-1,3-dioxolane-2-yl linking groups can be cleaved by H₃O⁺;
2-silylethoxy linking groups can be cleaved by fluoride or acid; 2-(X)-ethoxy (where X = keto, ester, amide, cyano, NO₂,
sulfide, sulfoxide, sulfone) linking groups can be cleaved under alkaline conditions; 2-, 3-, 4-, 5-, or 6-substituted-benzyloxy
linking groups can be cleaved by acid or under reductive conditions; 2-butenyloxy linking groups can be cleaved
by (Ph₃P)₃RhCl(H); 3-, 4-, 5-, or 6-substituted-2-bromophenoxy linking groups can be cleaved by Li, Mg, or BuLi;
methylthiomethoxy linking groups can be cleaved by Hg²⁺; 2-(X)-ethyloxy (where X = a halogen) linking groups can be
cleaved by Zn or Mg; 2-hydroxyethyloxy linking groups can be cleaved by oxidation (e.g., with Pb(OAc)₄).
[0214] Preferred linkers are those that are cleaved by acid or photolysis. Several of the acid-labile linkers that have
been developed for solid phase peptide synthesis are useful for linking tags to MOIs. Some of these linkers are
described in a recent review by Lloyd-Williams et al. (Tetrahedron 49:11065-11133, 1993). One useful type of linker is
based upon p-alkoxybenzyl alcohols, of which two, 4-hydroxymethylphenoxyacetic acid and 4-(4-hydroxymethyl-3-
methoxyphenoxy)butyric acid, are commercially available from Advanced ChemTech (Louisville, KY). Both linkers can
be attached to a tag via an ester linkage to the benzyl alcohol, and to an amine-containing MOI via an amide linkage to
the carboxylic acid. Tags linked by these molecules are released from the MOI with varying concentrations of trifluoroacetic
acid. The cleavage of these linkers results in the liberation of a carboxylic acid on the tag. Acid cleavage of tags
attached through related linkers, such as 2,4-dimethoxy-4'-(carboxymethyloxy)-benzhydrylamine (available from
Advanced ChemTech in FMOC-protected form), results in liberation of a carboxylic amide on the released tag.
[0215] The photolabile linkers useful for this application have also been for the most part developed for solid phase
peptide synthesis (see Lloyd-Williams review). These linkers are usually based on 2-nitrobenzylesters or 2-nitrobenzylamides.
Two examples of photolabile linkers that have recently been reported in the literature are 4-(4-(1-(Fmoc-amino)ethyl)-
2-methoxy-5-nitrophenoxy)butanoic acid (Holmes and Jones, J. Org. Chem. 60:2318-2319, 1995) and 3-
(Fmoc-amino)-3-(2-nitrophenyl)propionic acid (Brown et al., Molecular Diversity 1:4-12, 1995). Both linkers can be
attached via the carboxylic acid to an amine on the MOI. The attachment of the tag to the linker is made by forming an
amide between a carboxylic acid on the tag and the amine on the linker. Cleavage of photolabile linkers is usually performed
with UV light of 350 nm wavelength at intensities and times known to those in the art. Examples of commercial
sources of instruments for photochemical cleavage are Aura Industries Inc. (Staten Island, NY) and Agrenetics (Wilmington,
MA). Cleavage of the linkers results in liberation of a primary amide on the tag. Examples of photocleavable linkers
include nitrophenyl glycine esters, exo- and endo-2-benzonorborneyl chlorides and methane sulfonates, and 3-
amino-3-(2-nitrophenyl)propionic acid. Examples of enzymatic cleavage include esterases which will cleave ester
bonds, nucleases which will cleave phosphodiester bonds, proteases which cleave peptide bonds, etc.

E. DETECTION OF TAGS 

[0216] Detection methods typically rely on the absorption and emission in some type of spectral field. When atoms
or molecules absorb light, the incoming energy excites a quantized structure to a higher energy level. The type of excitation
depends on the wavelength of the light. Electrons are promoted to higher orbitals by ultraviolet or visible light,
molecular vibrations are excited by infrared light, and rotations are excited by microwaves. An absorption spectrum is
the absorption of light as a function of wavelength. The spectrum of an atom or molecule depends on its energy level
structure. Absorption spectra are useful for identification of compounds. Specific absorption spectroscopic methods
include atomic absorption spectroscopy (AA), infrared spectroscopy (IR), and UV-vis spectroscopy (uv-vis).
[0217] Atoms or molecules that are excited to high energy levels can decay to lower levels by emitting radiation.
This light emission is called fluorescence if the transition is between states of the same spin, and phosphorescence if
the transition occurs between states of different spin. The emission intensity of an analyte is linearly proportional to concentration
(at low concentrations), and is useful for quantifying the emitting species. Specific emission spectroscopic
methods include atomic emission spectroscopy (AES), atomic fluorescence spectroscopy (AFS), molecular laser-induced
fluorescence (LIF), and X-ray fluorescence (XRF).

[0218] When electromagnetic radiation passes through matter, most of the radiation continues in its original direction
but a small fraction is scattered in other directions. Light that is scattered at the same wavelength as the incoming
light is called Rayleigh scattering. Light that is scattered in transparent solids due to vibrations (phonons) is called
Brillouin scattering. Brillouin scattering is typically shifted by 0.1 to 1 wavenumber from the incident light. Light that is
scattered due to vibrations in molecules or optical phonons in opaque solids is called Raman scattering. Raman scattered
light is shifted by as much as 4000 wavenumbers from the incident light. Specific scattering spectroscopic methods
include Raman spectroscopy.

[0219] IR spectroscopy is the measurement of the wavelength and intensity of the absorption of mid-infrared light
by a sample. Mid-infrared light (2.5-50 µm, 4000-200 cm⁻¹) is energetic enough to excite molecular vibrations to
higher energy levels. The wavelengths of IR absorption bands are characteristic of specific types of chemical bonds and
IR spectroscopy is generally most useful for identification of organic and organometallic molecules.
[0220] Near-infrared absorption spectroscopy (NIR) is the measurement of the wavelength and intensity of the
absorption of near-infrared light by a sample. Near-infrared light spans the 800 nm-2.5 µm (12,500-4000 cm⁻¹) range
and is energetic enough to excite overtones and combinations of molecular vibrations to higher energy levels. NIR
spectroscopy is typically used for quantitative measurement of organic functional groups, especially O-H, N-H, and
C=O. The components and design of NIR instrumentation are similar to uv-vis absorption spectrometers. The light
source is usually a tungsten lamp and the detector is usually a PbS solid-state detector. Sample holders can be glass
or quartz and typical solvents are CCl₄ and CS₂. The convenient instrumentation of NIR spectroscopy makes it suitable
for on-line monitoring and process control.

[0221] Ultraviolet and visible absorption spectroscopy (uv-vis) is the measurement of the wavelength
and intensity of absorption of near-ultraviolet and visible light by a sample. Absorption in the vacuum UV occurs
at 100-200 nm (10⁵-50,000 cm⁻¹), in the quartz UV at 200-350 nm (50,000-28,570 cm⁻¹), and in the visible at
350-800 nm (28,570-12,500 cm⁻¹), and is described by the Beer-Lambert-Bouguer law. Ultraviolet and visible light are
energetic enough to promote outer electrons to higher energy levels. UV-vis spectroscopy is usually applied to molecules
and inorganic ions or complexes in solution. The usefulness of uv-vis spectra is limited by their broad features. The light
source is usually a hydrogen or deuterium lamp for uv measurements and a tungsten lamp for visible measurements. The
wavelengths of these continuous light sources are selected with a wavelength separator such as a prism or grating
monochromator. Spectra are obtained by scanning the wavelength separator and quantitative measurements can be made
from a spectrum or at a single wavelength.
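
The wavelength and wavenumber limits quoted in the spectroscopy paragraphs above are related by a simple reciprocal, and the Beer-Lambert-Bouguer law is a simple product. The following Python sketch is given for illustration only; the function names and the sample values in the usage section are assumptions and do not appear in the text.

```python
def wavelength_nm_to_wavenumber(wavelength_nm: float) -> float:
    """Return the wavenumber in cm^-1 for a wavelength given in nm."""
    # 1 cm = 1e7 nm, so wavenumber (cm^-1) = 1e7 / wavelength (nm).
    return 1.0e7 / wavelength_nm


def absorbance(molar_absorptivity: float, concentration_molar: float,
               path_length_cm: float = 1.0) -> float:
    """Beer-Lambert-Bouguer law: A = epsilon * c * l."""
    return molar_absorptivity * concentration_molar * path_length_cm


if __name__ == "__main__":
    # Band edges quoted in the text: vacuum UV, quartz UV, visible, NIR.
    for nm in (100, 200, 350, 800, 2500):
        print(f"{nm:>5} nm -> {wavelength_nm_to_wavenumber(nm):>9.0f} cm^-1")
    # Hypothetical sample: epsilon = 10,000 M^-1 cm^-1, 50 uM, 1 cm cell.
    print("A =", absorbance(1.0e4, 50e-6))
```

The 350 nm boundary converts to roughly 28,571 cm⁻¹, which the text rounds to 28,570 cm⁻¹.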

[0222] Mass spectrometers use the difference in the mass-to-charge ratio (m/z) of ionized atoms or molecules to
separate them from each other. Mass spectrometry is therefore useful for quantitation of atoms or molecules and also
for determining chemical and structural information about molecules. Molecules have distinctive fragmentation patterns
that provide structural information to identify compounds. The general operations of a mass spectrometer are as follows.
Gas-phase ions are created, the ions are separated in space or time based on their mass-to-charge ratio, and the
quantity of ions of each mass-to-charge ratio is measured. The ion separation power of a mass spectrometer is
described by the resolution, which is defined as R = m/Δm, where m is the ion mass and Δm is the difference
in mass between two resolvable peaks in a mass spectrum. For example, a mass spectrometer with a resolution of
1000 can resolve an ion with an m/z of 100.0 from an ion with an m/z of 100.1.
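
The resolution example above can be checked numerically. The following Python sketch is illustrative only; the function names are hypothetical, and a small relative tolerance is assumed so that floating-point rounding does not turn the borderline case (R exactly equal to m/Δm) into a failure.

```python
def required_resolution(m1: float, m2: float) -> float:
    """Resolution R = m / delta_m needed to separate ions at m1 and m2."""
    delta_m = abs(m2 - m1)
    if delta_m == 0:
        raise ValueError("identical m/z values cannot be resolved")
    # Use the lighter ion as the reference mass m.
    return min(m1, m2) / delta_m


def can_resolve(m1: float, m2: float, resolution: float,
                rel_tol: float = 1e-9) -> bool:
    """True if an instrument of the given resolution separates the two ions."""
    return resolution >= required_resolution(m1, m2) * (1.0 - rel_tol)


if __name__ == "__main__":
    print(required_resolution(100.0, 100.1))   # close to 1000
    print(can_resolve(100.0, 100.1, 1000))     # the example in the text
    print(can_resolve(1000.0, 1000.1, 1000))   # same delta_m at higher mass
```

The last call illustrates why a fixed resolution separates a 0.1 difference at m/z 100 but not at m/z 1000, where a resolution of about 10,000 would be needed.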

[0223] In general, a mass spectrometer (MS) consists of an ion source, a mass-selective analyzer, and an ion
detector. The magnetic-sector, quadrupole, and time-of-flight designs also require extraction and acceleration ion optics
to transfer ions from the source region into the mass analyzer. The details of several mass analyzer designs (for magnetic-sector
MS, quadrupole MS or time-of-flight MS) are discussed below. Single-focusing analyzers for magnetic-sector
MS utilize a particle beam path of 180, 90, or 60 degrees. The various forces influencing the particle separate
ions with different mass-to-charge ratios. With double-focusing analyzers, an electrostatic analyzer is added in this type
of instrument to separate particles with differences in kinetic energy.

[0224] A quadrupole mass filter for quadrupole MS consists of four metal rods arranged in parallel. The applied voltages
affect the trajectory of ions traveling down the flight path centered between the four rods. For given DC and AC
voltages, only ions of a certain mass-to-charge ratio pass through the quadrupole filter and all other ions are thrown out
of their original path. A mass spectrum is obtained by monitoring the ions passing through the quadrupole filter as the
voltages on the rods are varied.

[0225] A time-of-flight mass spectrometer uses the differences in transit time through a "drift region" to separate
ions of different masses. It operates in a pulsed mode so ions must be produced in pulses and/or extracted in pulses.
A pulsed electric field accelerates all ions into a field-free drift region with a kinetic energy of qV, where q is the ion
charge and V is the applied voltage. Since the ion kinetic energy is 0.5 mv² (where m is the ion mass and v its velocity),
lighter ions have a higher velocity than heavier ions and reach the detector at the end of the drift region sooner. The
output of an ion detector is displayed on an oscilloscope as a function of time to produce the mass spectrum.
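
Equating the kinetic energy qV to one half of the ion mass times the square of its velocity gives a velocity of sqrt(2qV/m) and a drift time of L/v for a drift region of length L. The following Python sketch illustrates this relationship; the 20 kV accelerating voltage and 1 m drift length are assumed values chosen only for illustration, and the function names are hypothetical.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs per unit charge
DALTON_KG = 1.66053906660e-27           # kilograms per dalton


def ion_velocity(mass_da: float, accel_voltage_v: float,
                 charge_state: int = 1) -> float:
    """Velocity (m/s) from q*V = 0.5*m*v**2."""
    q = charge_state * ELEMENTARY_CHARGE_C
    m = mass_da * DALTON_KG
    return math.sqrt(2.0 * q * accel_voltage_v / m)


def flight_time_s(mass_da: float, accel_voltage_v: float,
                  drift_length_m: float, charge_state: int = 1) -> float:
    """Transit time (s) through a field-free drift region."""
    return drift_length_m / ion_velocity(mass_da, accel_voltage_v, charge_state)


if __name__ == "__main__":
    # Assumed instrument parameters (not taken from the text).
    voltage, length = 20_000.0, 1.0
    for mass in (100.0, 400.0, 1000.0):
        t_us = flight_time_s(mass, voltage, length) * 1e6
        print(f"m = {mass:6.0f} Da -> t = {t_us:6.2f} microseconds")
```

Because the flight time scales with the square root of m/q, an ion four times heavier takes twice as long to reach the detector.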

[0226] The ion formation process is the starting point for mass spectrometry analyses. Chemical ionization is a
method that employs a reagent ion to react with the analyte molecules (tags) to form ions by either a proton or hydride
transfer. The reagent ions are produced by introducing a large excess of methane (relative to the tag) into an electron
impact (EI) ion source. Electron collisions produce CH₄⁺ and CH₃⁺, which further react with methane to form CH₅⁺ and
C₂H₅⁺. Another method to ionize tags is by plasma and glow discharge. Plasma is a hot, partially-ionized gas that
effectively excites and ionizes atoms. A glow discharge is a low-pressure plasma maintained between two electrodes.
Electron impact ionization employs an electron beam, usually generated from a tungsten filament, to ionize gas-phase
atoms or molecules. An electron from the beam knocks an electron off analyte atoms or molecules to create ions.
Electrospray ionization utilizes a very fine needle and a series of skimmers. A sample solution is sprayed into the source
chamber to form droplets. The droplets carry charge when they exit the capillary and as the solvent vaporizes the droplets
disappear, leaving highly charged analyte molecules. ESI is particularly useful for large biological molecules that are
difficult to vaporize or ionize. Fast-atom bombardment (FAB) utilizes a high-energy beam of neutral atoms, typically Xe
or Ar, that strikes a solid sample causing desorption and ionization. It is used for large biological molecules that are
difficult to get into the gas phase. FAB causes little fragmentation and usually gives a large molecular ion peak, making it
useful for molecular weight determination. The atomic beam is produced by accelerating ions from an ion source
through a charge-exchange cell. The ions pick up an electron in collisions with neutral atoms to form a beam of high
energy atoms. Laser ionization (LIMS) is a method in which a laser pulse ablates material from the surface of a sample
and creates a microplasma that ionizes some of the sample constituents. Matrix-assisted laser desorption ionization
(MALDI) is a LIMS method of vaporizing and ionizing large biological molecules such as proteins or DNA fragments.
The biological molecules are dispersed in a solid matrix such as nicotinic acid. A UV laser pulse ablates the matrix,
which carries some of the large molecules into the gas phase in an ionized form so they can be extracted into a mass
spectrometer. Plasma-desorption ionization (PD) utilizes the decay of ²⁵²Cf, which produces two fission fragments that
travel in opposite directions. One fragment strikes the sample, knocking out 1-10 analyte ions. The other fragment
strikes a detector and triggers the start of data acquisition. This ionization method is especially useful for large biological
molecules. Resonance ionization (RIMS) is a method in which one or more laser beams are tuned in resonance to
transitions of a gas-phase atom or molecule to promote it in a stepwise fashion above its ionization potential to create
an ion. Secondary ionization (SIMS) utilizes an ion beam, such as ³He⁺, ¹⁶O⁺, or ⁴⁰Ar⁺, that is focused onto the surface of
a sample and sputters material into the gas phase. Spark source is a method which ionizes analytes in solid samples
by pulsing an electric current across two electrodes.

[0227] A tag may become charged prior to, during or after cleavage from the molecule to which it is attached.
Ionization methods based on ion "desorption", the direct formation or emission of ions from solid or liquid surfaces, have
allowed increasing application to nonvolatile and thermally labile compounds. These methods eliminate the need for
neutral molecule volatilization prior to ionization and generally minimize thermal degradation of the molecular species.
These methods include field desorption (Beckey, Principles of Field Ionization and Field Desorption Mass Spectrometry,
Pergamon, Oxford, 1977), plasma desorption (Sundqvist and Macfarlane, Mass Spectrom. Rev. 4:421, 1985), laser
desorption (Karas and Hillenkamp, Anal. Chem. 60:2299, 1988; Karas et al., Angew. Chem. 101:805, 1989), fast particle
bombardment (e.g., fast atom bombardment, FAB, and secondary ion mass spectrometry, SIMS; Barber et al.,
Anal. Chem. 54:645A, 1982), and thermospray (TS) ionization (Vestal, Mass Spectrom. Rev. 2:447, 1983). Thermospray
is broadly applied for the on-line combination with liquid chromatography. The continuous flow FAB methods (Caprioli
et al., Anal. Chem. 58:2949, 1986) have also shown significant potential. A more complete listing of ionization/mass
spectrometry combinations includes ion-trap mass spectrometry, electrospray ionization mass spectrometry, ion-spray mass
spectrometry, liquid ionization mass spectrometry, atmospheric pressure ionization mass spectrometry, electron ionization
mass spectrometry, metastable atom bombardment ionization mass spectrometry, fast atom bombardment ionization
mass spectrometry, MALDI mass spectrometry, photo-ionization time-of-flight mass spectrometry, laser droplet mass
spectrometry, MALDI-TOF mass spectrometry, APCI mass spectrometry, nano-spray mass spectrometry, nebulised
spray ionization mass spectrometry, chemical ionization mass spectrometry, resonance ionization mass spectrometry,
secondary ionization mass spectrometry, and thermospray mass spectrometry.

[0228] The ionization methods amenable to nonvolatile biological compounds have overlapping ranges of applicability.
Ionization efficiencies are highly dependent on matrix composition and compound type. Currently available results
indicate that the upper molecular mass for TS is about 8000 daltons (Jones and Krolik, Rapid Comm. Mass Spectrom.
1:67, 1987). Since TS is practiced mainly with quadrupole mass spectrometers, sensitivity typically suffers disproportionately
at higher mass-to-charge ratios (m/z). Time-of-flight (TOF) mass spectrometers are commercially available
and possess the advantage that the m/z range is limited only by detector efficiency. Recently, two additional ionization
methods have been introduced. These two methods are now referred to as matrix-assisted laser desorption (MALDI;
Karas and Hillenkamp, Anal. Chem. 60:2299, 1988; Karas et al., Angew. Chem. 101:805, 1989) and electrospray
ionization (ESI). Both methodologies have very high ionization efficiency (i.e., very high [molecular ions
produced]/[molecules consumed]). Sensitivity, which defines the ultimate potential of the technique, is dependent on
sample size, quantity of ions, flow rate, detection efficiency and actual ionization efficiency.

[0229] Electrospray-MS is based on an idea first proposed in the 1960s (Dole et al., J. Chem. Phys. 49:2240,
1968). Electrospray ionization (ESI) is one means to produce charged molecules for analysis by mass spectroscopy.
Briefly, electrospray ionization produces highly charged droplets by nebulizing liquids in a strong electrostatic field. The
highly charged droplets, generally formed in a dry bath gas at atmospheric pressure, shrink by evaporation of neutral
solvent until the charge repulsion overcomes the cohesive forces, leading to a "Coulombic explosion". The exact mechanism
of ionization is controversial and several groups have put forth hypotheses (Blades et al., Anal. Chem. 63:2109-14,
1991; Kebarle et al., Anal. Chem. 65:A972-86, 1993; Fenn, J. Am. Soc. Mass. Spectrom. 4:524-35, 1993). Regardless
of the ultimate process of ion formation, ESI produces charged molecules from solution under mild conditions.




[0230] The ability to obtain useful mass spectral data on small amounts of an organic molecule relies on the efficient
production of ions. The efficiency of ionization for ESI is related to the extent of positive charge associated with
the molecule. Improving ionization experimentally has usually involved using acidic conditions. Another method to
improve ionization has been to use quaternary amines when possible (see Aebersold et al., Protein Science 1:494-503,
1992; Smith et al., Anal. Chem. 60:436-41, 1988).

[0231] Electrospray ionization is described in more detail as follows. Electrospray ion production requires two steps:
dispersal of highly charged droplets at near atmospheric pressure, followed by conditions to induce evaporation. A solu- 
tion of analyte molecules is passed through a needle that is kept at high electric potential. At the end of the needle, the 
solution disperses into a mist of small highly charged droplets containing the analyte molecules. The small droplets 
evaporate quickly and by a process of field desorption or residual evaporation, protonated protein molecules are 
released into the gas phase. An electrospray is generally produced by application of a high electric field to a small flow 
of liquid (generally 1-10 µL/min) from a capillary tube. A potential difference of 3-6 kV is typically applied between the
capillary and counter electrode located 0.2-2 cm away (where ions, charged clusters, and even charged droplets, 
depending on the extent of desolvation, may be sampled by the MS through a small orifice). The electric field results in 
charge accumulation on the liquid surface at the capillary terminus; thus the liquid flow rate, resistivity, and surface ten- 
sion are important factors in droplet production. The high electric field results in disruption of the liquid surface and for- 
mation of highly charged liquid droplets. Positively or negatively charged droplets can be produced depending upon the 
capillary bias. The negative ion mode requires the presence of an electron scavenger such as oxygen to inhibit electri- 
cal discharge. 

[0232] A wide range of liquids can be sprayed electrostatically into a vacuum, or with the aid of a nebulizing agent.
The use of only electric fields for nebulization leads to some practical restrictions on the range of liquid conductivity and
dielectric constant. Solution conductivity of less than 10⁻⁵ ohms is required at room temperature for a stable electrospray
at useful liquid flow rates corresponding to an aqueous electrolyte solution of < 10⁻⁴ M. In the mode found most
useful for ESI-MS, an appropriate liquid flow rate results in dispersion of the liquid as a fine mist. A short distance from
the capillary the droplet diameter is often quite uniform and on the order of 1 µm. Of particular importance is that the
total electrospray ion current increases only slightly for higher liquid flow rates. There is evidence that heating is useful
for manipulating the electrospray. For example, slight heating allows aqueous solutions to be readily electrosprayed,
presumably due to the decreased viscosity and surface tension. Both thermally-assisted and gas-nebulization-assisted
electrosprays allow higher liquid flow rates to be used, but decrease the extent of droplet charging. The formation of
molecular ions requires conditions effecting evaporation of the initial droplet population. This can be accomplished at
higher pressures by a flow of dry gas at moderate temperatures (<60°C), by heating during transport through the interface,
and (particularly in the case of ion trapping methods) by energetic collisions at relatively low pressure.
[0233] Although the detailed processes underlying ESI remain uncertain, the very small droplets produced by ESI
appear to allow almost any species carrying a net charge in solution to be transferred to the gas phase after evaporation
of residual solvent. Mass spectrometry detection then requires that ions have a tractable m/z range (<4000 daltons for
quadrupole instruments) after desolvation, and that they be produced and transmitted with sufficient efficiency. The wide
range of solutes already found to be amenable to ESI-MS, and the lack of substantial dependence of ionization efficiency
upon molecular weight, suggest a highly non-discriminating and broadly applicable ionization process.
[0234] The electrospray ion "source" functions at near atmospheric pressure. The electrospray "source" is typically 
a metal or glass capillary incorporating a method for electrically biasing the liquid solution relative to a counter elec- 
trode. Solutions, typically water-methanol mixtures containing the analyte and often other additives such as acetic acid, 
flow to the capillary terminus. An ESI source has been described (Smith et al., Anal. Chem. 62:885, 1990) which can
accommodate essentially any solvent system. Typical flow rates for ESI are 1-10 µL/min. The principal requirement of
an ESI-MS interface is to sample and transport ions from the high pressure region into the MS as efficiently as possible. 
[0235] The efficiency of ESI can be very high, providing the basis for extremely sensitive measurements, which is
useful for the invention described herein. Current instrumental performance can provide a total ion current at the detector
of about 2 × 10⁻¹² A or about 10⁷ counts/s for singly charged species. On the basis of the instrumental performance,
concentrations of as low as 10⁻¹⁰ M or about 10⁻¹⁸ mol/s of a singly charged species will give detectable ion current
(about 10 counts/s) if the analyte is completely ionized. For example, low attomole detection limits have been obtained
for quaternary ammonium ions using an ESI interface with capillary zone electrophoresis (Smith et al., Anal. Chem.
59:1230, 1988). For a compound of molecular weight of 1000, the average number of charges is 1, the approximate
number of charge states is 1, peak width (m/z) is 1 and the maximum intensity (ion/s) is 1 × 10¹².
[0236] Remarkably little sample is actually consumed in obtaining an ESI mass spectrum (Smith et al., Anal. Chem.
60:1948, 1988). Substantial gains might be also obtained by the use of array detectors with sector instruments, allowing
simultaneous detection of portions of the spectrum. Since currently only about 10⁻⁵ of all ions formed by ESI are
detected, attention to the factors limiting instrument performance may provide a basis for improved sensitivity. It will be
evident to those in the art that the present invention contemplates and accommodates improvements in ionization
and detection methodologies.
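
The sensitivity figures in the two preceding paragraphs can be cross-checked with a short calculation. The following Python sketch reads those figures as an ion current of 2 × 10⁻¹² A, a concentration of 10⁻¹⁰ M, and a detected fraction of 10⁻⁵, and additionally assumes a flow rate of 1 µL/min (the low end of the typical ESI range). The function names are hypothetical and the sketch is illustrative rather than a description of any particular instrument.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs per unit charge
AVOGADRO = 6.02214076e23                # particles per mole


def ions_per_second(current_a: float, charge_state: int = 1) -> float:
    """Ion arrival rate corresponding to a measured ion current."""
    return current_a / (charge_state * ELEMENTARY_CHARGE_C)


def molar_delivery_rate(concentration_molar: float,
                        flow_ul_per_min: float) -> float:
    """Analyte delivered to the source, in mol/s."""
    litres_per_second = flow_ul_per_min * 1e-6 / 60.0
    return concentration_molar * litres_per_second


def detected_counts_per_second(concentration_molar: float,
                               flow_ul_per_min: float,
                               detected_fraction: float) -> float:
    """Counts/s if every molecule is ionized and a fixed fraction is detected."""
    ions_formed = molar_delivery_rate(concentration_molar, flow_ul_per_min) * AVOGADRO
    return ions_formed * detected_fraction


if __name__ == "__main__":
    print(f"{ions_per_second(2e-12):.3g} ions/s at 2e-12 A")
    print(f"{molar_delivery_rate(1e-10, 1.0):.3g} mol/s at 1e-10 M, 1 uL/min")
    print(f"{detected_counts_per_second(1e-10, 1.0, 1e-5):.3g} counts/s detected")
```

Under these assumptions the three results are on the order of 10⁷ ions/s, 10⁻¹⁸ mol/s and 10 counts/s respectively, in line with the values quoted in the text.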




[0237] An interface is preferably placed between the separation instrumentation (e.g., gel) and the detector (e.g.,
mass spectrometer). The interface preferably has the following properties: (1) the ability to collect the DNA fragments
at discrete time intervals, (2) concentrate the DNA fragments, (3) remove the DNA fragments from the electrophoresis
buffers and milieu, (4) cleave the tag from the DNA fragment, (5) separate the tag from the DNA fragment, (6) dispose
of the DNA fragment, (7) place the tag in a volatile solution, (8) volatilize and ionize the tag, and (9) place or transport
the tag to an electrospray device that introduces the tag into the mass spectrometer.

[0238] The interface also has the capability of "collecting" DNA fragments as they elute from the bottom of a gel.
The gel may be composed of a slab gel, a tubular gel, a capillary, etc. The DNA fragments can be collected by several
methods. The first method is that of use of an electric field wherein DNA fragments are collected onto or near an electrode.
A second method is that wherein the DNA fragments are collected by flowing a stream of liquid past the bottom
of a gel. Aspects of both methods can be combined wherein DNA is collected into a flowing stream which can be later
concentrated by use of an electric field. The end result is that DNA fragments are removed from the milieu under which
the separation method was performed. That is, DNA fragments can be "dragged" from one solution type to another by
use of an electric field.

[0239] Once the DNA fragments are in the appropriate solution (compatible with electrospray and mass spectrometry)
the tag can be cleaved from the DNA fragment. The DNA fragment (or remnants thereof) can then be separated
from the tag by the application of an electric field (preferably, the tag is of opposite charge of that of the DNA tag). The
tag is then introduced into the electrospray device by the use of an electric field or a flowing liquid.
[0240] Fluorescent tags can be identified and quantitated most directly by their absorption and fluorescence emission
wavelengths and intensities.
[0241] While a conventional spectrofluorometer is extremely flexible, providing continuous ranges of excitation and
emission wavelengths (l EX . I S1 . I^), more specialized instruments such as flow cytometers and laser-scanning microscopes
require probes that are excitable at a single fixed wavelength. In contemporary instruments, this is usually the
488-nm line of the argon laser.

[0242] Fluorescence intensity per probe molecule is proportional to the product of ε and QY. The range of these
parameters among fluorophores of current practical importance is approximately 10,000 to 100,000 cm⁻¹M⁻¹ for ε and
0.1 to 1.0 for QY. When absorption is driven toward saturation by high-intensity illumination, the irreversible destruction
of the excited fluorophore (photobleaching) becomes the factor limiting fluorescence detectability. The practical impact
of photobleaching depends on the fluorescent detection technique in question.

[0243] It will be evident to one in the art that a device (an interface) may be interposed between the separation and
detection steps to permit the continuous operation of size separation and tag detection (in real time). This unites the
separation methodology and instrumentation with the detection methodology and instrumentation, forming a single
device. For example, an interface is interposed between a separation technique and detection by mass spectrometry
or potentiostatic amperometry.
[0244] The function of the interface is primarily the release of the (e.g., mass spectrometry) tag from analyte. There
are several representative implementations of the interface. The design of the interface is dependent on the choice of
cleavable linkers. In the case of light or photo-cleavable linkers, an energy or photon source is required. In the case of
an acid-labile linker, a base-labile linker, or a disulfide linker, reagent addition is required within the interface. In the case
of heat-labile linkers, an energy heat source is required. Enzyme addition is required for an enzyme-sensitive linker
such as a specific protease and a peptide linker, a nuclease and a DNA or RNA linker, a glycosylase, HRP or phosphatase
and a linker which is unstable after cleavage (e.g., similar to chemiluminescent substrates). Other characteristics
of the interface include minimal band broadening and separation of DNA from tags before injection into a mass
spectrometer. Separation techniques include those based on electrophoretic methods and techniques, affinity techniques,
size retention (dialysis), filtration and the like.
[0245] It is also possible to concentrate the tags (or nucleic acid-linker-tag construct), capture electrophoretically,
and then release into an alternate reagent stream which is compatible with the particular type of ionization method
selected. The interface may also be capable of capturing the tags (or nucleic acid-linker-tag construct) on microbeads,
shooting the bead(s) into a chamber and then performing laser desorption/vaporization. Also it is possible to extract in
flow into alternate buffer (e.g., from capillary electrophoresis buffer into hydrophobic buffer across a permeable
membrane). It may also be desirable in some uses to deliver tags into the mass spectrometer intermittently, which would
comprise a further function of the interface. Another function of the interface is to deliver tags from multiple columns into
a mass spectrometer, with a rotating time slot for each column. Also, it is possible to deliver tags from a single column
into multiple MS detectors, separated by time, collect each set of tags for a few milliseconds, and then deliver to a mass
spectrometer.
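The rotating time slot described for multiple columns can be pictured as a simple round-robin schedule. The following Python sketch is only an illustration under stated assumptions: the column names, the slot length, and the function name `rotating_schedule` are hypothetical and do not come from the specification.

```python
from itertools import cycle, islice
from typing import List, Sequence, Tuple


def rotating_schedule(columns: Sequence[str], slot_ms: float, n_slots: int) -> List[Tuple[float, str]]:
    """Assign consecutive time slots to columns in rotation.

    Returns (slot start time in ms, column) pairs. The slot length is an
    assumed value chosen by the caller, not a figure from the text.
    """
    slots = islice(cycle(columns), n_slots)
    return [(i * slot_ms, column) for i, column in enumerate(slots)]


# Hypothetical example: three separation columns sharing one mass spectrometer.
schedule = rotating_schedule(["column_1", "column_2", "column_3"], slot_ms=5, n_slots=4)
for start, column in schedule:
    print(f"t = {start} ms: deliver tags from {column}")
```

After every column has had one slot, the rotation returns to the first column, so each column is sampled at a fixed interval equal to the slot length times the number of columns.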

[0246] The following is a list of representative vendors for separation and detection technologies which may be
used in the present invention. Hoefer Scientific Instruments (San Francisco, CA) manufactures electrophoresis
equipment (Two Step™, Poker Face™ II) for sequencing applications. Pharmacia Biotech (Piscataway, NJ) manufactures
electrophoresis equipment for DNA separations and sequencing (PhastSystem for PCR-SSCP analysis, MacroPhor






System for DNA sequencing). Perkin Elmer/Applied Biosystems Division (ABI, Foster City, CA) manufactures
semi-automated sequencers based on fluorescent dyes (ABI373 and ABI377). Analytical Spectral Devices (Boulder, CO)
manufactures UV spectrometers. Hitachi Instruments (Tokyo, Japan) manufactures Atomic Absorption spectrometers,
Fluorescence spectrometers, LC and GC Mass Spectrometers, NMR spectrometers, and UV-VIS Spectrometers.
PerSeptive Biosystems (Framingham, MA) produces Mass Spectrometers (Voyager™ Elite). Bruker Instruments Inc.
(Manning Park, MA) manufactures FTIR Spectrometers (Vector 22), FT-Raman Spectrometers, Time of Flight Mass
Spectrometers (Reflex II™), Ion Trap Mass Spectrometer (Esquire™) and a MALDI Mass Spectrometer. Analytical
Technology Inc. (ATI, Boston, MA) makes Capillary Gel Electrophoresis units, UV detectors, and Diode Array Detectors.
Teledyne Electronic Technologies (Mountain View, CA) manufactures an Ion Trap Mass Spectrometer (3DQ Discovery™
and the 3DQ Apogee™). Perkin Elmer/Applied Biosystems Division (Foster City, CA) manufactures a Sciex Mass
Spectrometer (triple quadrupole LC/MS/MS, the API 100/300) which is compatible with electrospray. Hewlett-Packard (Santa
Clara, CA) produces Mass Selective Detectors (HP 5972A), MALDI-TOF Mass Spectrometers (HP G2025A), Diode
Array Detectors, CE units, HPLC units (HP 1090) as well as UV Spectrometers. Finnigan Corporation (San Jose, CA)
manufactures mass spectrometers (magnetic sector (MAT 95 S™), quadrupole spectrometers (MAT 95 SQ™) and four
other related mass spectrometers). Rainin (Emeryville, CA) manufactures HPLC instruments.

[0247] The methods and compositions described herein permit the use of cleaved tags to serve as maps to a
particular sample type and nucleotide identity. At the beginning of each sequencing method, a particular (selected) primer is
assigned a particular unique tag. The tags map to either a sample type, a dideoxy terminator type (in the case of a
Sanger sequencing reaction) or preferably both. Specifically, the tag maps to a primer type which in turn maps to a
vector type which in turn maps to a sample identity. The tag may also map to a dideoxy terminator type (ddTTP, ddCTP,
ddGTP, ddATP) by reference to the dideoxynucleotide reaction into which the tagged primer is placed. The sequencing
reaction is then performed and the resulting fragments are sequentially separated by size in time.

[0248] The tags are cleaved from the fragments in a temporal frame and measured and recorded in a temporal
frame. The sequence is constructed by comparing the tag map to the temporal frame. That is, all tag identities are
recorded in time after the sizing step and become related to one another in a temporal frame. The sizing step
separates the nucleic acid fragments by a one nucleotide increment and hence the related tag identities are separated
by a one nucleotide increment. By foreknowledge of the dideoxy-terminator or nucleotide map and sample type, the
sequence is readily deduced in a linear fashion.
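The tag-map idea in the two preceding paragraphs can be sketched in a few lines of Python. This is a minimal illustration, not the method of the specification: the tag names, sample names, detection times, and the function `read_sequences` are all hypothetical. It assumes each tag maps to one (sample, terminator) pair and that tags are detected in order of increasing fragment length.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical tag map: tag identity -> (sample identity, dideoxy terminator base).
TAG_MAP: Dict[str, Tuple[str, str]] = {
    "tag01": ("sample_1", "A"), "tag02": ("sample_1", "C"),
    "tag03": ("sample_1", "G"), "tag04": ("sample_1", "T"),
    "tag05": ("sample_2", "A"), "tag06": ("sample_2", "C"),
    "tag07": ("sample_2", "G"), "tag08": ("sample_2", "T"),
}


def read_sequences(detections: Iterable[Tuple[float, str]],
                   tag_map: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
    """Deduce one sequence per sample from time-stamped tag detections.

    Shorter fragments leave the sizing step first, so sorting by detection
    time orders the terminator calls by position along the extended strand.
    """
    calls: Dict[str, List[str]] = {}
    for _, tag in sorted(detections):
        sample, base = tag_map[tag]
        calls.setdefault(sample, []).append(base)
    return {sample: "".join(bases) for sample, bases in calls.items()}


# Hypothetical detections (time, tag) from two samples run together.
detections = [(3.0, "tag04"), (1.0, "tag03"), (2.0, "tag08"),
              (1.0, "tag05"), (2.0, "tag01"), (3.0, "tag06")]
print(read_sequences(detections, TAG_MAP))
```

Because every tag carries both the sample and the terminator identity, fragments from several samples can share one separation and still be read out separately.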

[0249] The following examples are offered by way of illustration, and not by way of limitation. 
[0250] Unless otherwise stated, chemicals as used in the examples may be obtained from Aldrich Chemical
Company, Milwaukee, WI. The following abbreviations, with the indicated meanings, are used herein:

ANP = 3-(Fmoc-amino)-3-(2-nitrophenyl)propionic acid
NBA = 4-(Fmoc-aminomethyl)-3-nitrobenzoic acid
HATU = O-7-azabenzotriazol-1-yl-N,N,N',N'-tetramethyluronium hexafluorophosphate
DIEA = diisopropylethylamine
MCT = monochlorotriazine
NMM = 4-methylmorpholine
NMP = N-methylpyrrolidone
ACT357 = ACT357 peptide synthesizer from Advanced ChemTech, Inc., Louisville, KY
ACT = Advanced ChemTech, Inc., Louisville, KY
NovaBiochem = CalBiochem-NovaBiochem International, San Diego, CA
TFA = Trifluoroacetic acid
Tfa = Trifluoroacetyl
iNIP = N-Methylisonipecotic acid
Tfp = Tetrafluorophenyl
DIAEA = 2-(Diisopropylamino)ethylamine
5'-AH-ODN = 5'-aminohexyl-tailed oligodeoxynucleotide




EXAMPLES 
EXAMPLE 1 

PREPARATION OF ACID LABILE LINKERS FOR USE IN CLEAVABLE-MW-IDENTIFIER SEQUENCING

A. Synthesis of Pentafluorophenyl Esters of Chemically Cleavable Mass Spectroscopy Tags, to Liberate Tags with
Carboxyl Amide Termini

[0251] Figure 1 shows the reaction scheme.

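Step A. TentaGel S AC resin (compound II; available from ACT; 1 eq.) is suspended with DMF in the collection
vessel of the ACT357 peptide synthesizer (ACT). Compound I (3 eq.), HATU (3 eq.) and DIEA (7.5 eq.) in DMF are
added and the collection vessel shaken for 1 hr. The solvent is removed and the resin washed with NMP (2X), MeOH
(2X), and DMF (2X). The coupling of I to the resin and the wash steps are repeated, to give compound III.

Step B. The resin (compound III) is mixed with 25% piperidine in DMF and shaken for 5 min. The resin is filtered,
then mixed with 25% piperidine in DMF and shaken for 10 min. The solvent is removed, the resin washed with
NMP (2X), MeOH (2X), and DMF (2X), and used directly in step C.

Step C. The deprotected resin from step B is suspended in DMF and to it is added an FMOC-protected amino
acid containing amine functionality in its side chain (compound IV, e.g. alpha-N-FMOC-3-(3-pyridyl)-alanine, available
from Synthetech, Albany, OR; 3 eq.), HATU (3 eq.), and DIEA (7.5 eq.) in DMF. The vessel is shaken for 1 hr. The
solvent is removed and the resin washed with NMP (2X), MeOH (2X), and DMF (2X). The coupling of IV to the resin
and the wash steps are repeated, to give compound V.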

Step D. The resin (compound V) is treated with piperidine as described in step B to remove the FMOC group. The
deprotected resin is then divided equally by the ACT357 from the collection vessel into 16 reaction vessels.

Step E. The 16 aliquots of deprotected resin from step D are suspended in DMF. To each reaction vessel is added
the appropriate carboxylic acid VI1-16 (R1-16CO2H; 3 eq.), HATU (3 eq.), and DIEA (7.5 eq.) in DMF. The vessels
are shaken for 1 hr. The solvent is removed and the aliquots of resin washed with NMP (2X), MeOH (2X), and DMF
(2X). The coupling of VI1-16 to the aliquots of resin and the wash steps are repeated, to give compounds VII1-16.

Step F. The aliquots of resin (compounds VII1-16) are washed with CH2Cl2 (3X). To each of the reaction vessels is
added 1% TFA in CH2Cl2 and the vessels shaken for 30 min. The solvent is filtered from the reaction vessels into
individual tubes. The aliquots of resin are washed with CH2Cl2 (2X) and MeOH (2X) and the filtrates combined into
the individual tubes. The individual tubes are evaporated in vacuo, providing compounds VIII1-16.

Step G. Each of the free carboxylic acids VIII1-16 is dissolved in DMF. To each solution is added pyridine (1.05 eq.),
followed by pentafluorophenyl trifluoroacetate (1.1 eq.). The mixtures are stirred for 45 min. at room temperature.
The solutions are diluted with EtOAc, washed with 1 M aq. citric acid (3X) and 5% aq. NaHCO3 (3X), dried over
Na2SO4, filtered, and evaporated in vacuo, providing compounds IX1-16.
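The reagent quantities in the coupling steps above are given in equivalents relative to the resin. As a rough aid, the sketch below converts equivalents to millimoles. The resin amount of 0.25 mmol and the function name `reagent_mmol` are assumptions made only for illustration; the example does not state a resin scale.

```python
from typing import Dict


def reagent_mmol(resin_mmol: float, equivalents: Dict[str, float]) -> Dict[str, float]:
    """Convert reagent equivalents into mmol for a given amount of resin sites."""
    return {name: resin_mmol * eq for name, eq in equivalents.items()}


# Equivalents quoted for the coupling of compound I in Step A.
STEP_A_EQUIVALENTS = {"compound I": 3.0, "HATU": 3.0, "DIEA": 7.5}

# Assumed scale (hypothetical): 0.25 mmol of reactive sites on the resin.
RESIN_MMOL = 0.25

amounts = reagent_mmol(RESIN_MMOL, STEP_A_EQUIVALENTS)
for name, mmol in amounts.items():
    print(f"{name}: {mmol} mmol")

# After Step D the resin is divided equally into 16 reaction vessels.
per_vessel_mmol = RESIN_MMOL / 16
print(f"resin per vessel: {per_vessel_mmol} mmol")
```

The same helper applies to Step E by passing the per-vessel resin amount together with the equivalents quoted there.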

B. Synthesis of Pentafluorophenyl Esters of Chemically Cleavable Mass Spectroscopy Tags, to Liberate Tags with
Carboxyl Acid Termini

[0252] Figure 2 shows the reaction scheme.

Step A. 4-(Hydroxymethyl)phenoxybutyric acid (compound I; 1 eq.) is combined with DIEA (2.1 eq.) and allyl
bromide (2.1 eq.) in CHCl3 and heated to reflux for 2 hr. The mixture is diluted with EtOAc, washed with 1 N HCl (2X),
pH 9.5 carbonate buffer (2X), and brine (1X), dried over Na2SO4, and evaporated in vacuo to give the allyl ester of
compound I.

Step B. The allyl ester of compound I from step A (1.75 eq.) is combined in CH2Cl2 with an FMOC-protected amino
acid containing amine functionality in its side chain (compound II, e.g. alpha-N-FMOC-3-(3-pyridyl)-alanine,
available from Synthetech, Albany, OR; 1 eq.), N-methylmorpholine (2.5 eq.), and HATU (1.1 eq.), and stirred at room
temperature for 4 hr. The mixture is diluted with CH2Cl2, washed with 1 M aq. citric acid (2X), water (1X), and 5%




aq. NaHCO3 (2X), dried over Na2SO4, and evaporated in vacuo. Compound III is isolated by flash chromatography
(CH2Cl2 -> EtOAc).

Step C. Compound III is dissolved in CH2Cl2. Pd(PPh3)4 (0.07 eq.) and N-methylaniline (2 eq.) are added, and the
mixture stirred at room temperature for 4 hr. The mixture is diluted with CH2Cl2, washed with 1 M aq. citric acid (2X)
and water (1X), dried over Na2SO4, and evaporated in vacuo. Compound IV is isolated by flash chromatography
(CH2Cl2 -> EtOAc + HOAc).

Step D. TentaGel S AC resin (compound V; 1 eq.) is suspended with DMF in the collection vessel of the ACT357
peptide synthesizer (Advanced ChemTech Inc. (ACT), Louisville, KY). Compound IV (3 eq.), HATU (3 eq.) and
DIEA (7.5 eq.) in DMF are added and the collection vessel shaken for 1 hr. The solvent is removed and the resin
washed with NMP (2X), MeOH (2X), and DMF (2X). The coupling of IV to the resin and the wash steps are
repeated, to give compound VI.

Step E. The resin (compound VI) is mixed with 25% piperidine in DMF and shaken for 5 min. The resin is filtered,
then mixed with 25% piperidine in DMF and shaken for 10 min. The solvent is removed and the resin washed with
NMP (2X), MeOH (2X), and DMF (2X). The deprotected resin is then divided equally by the ACT357 from the
collection vessel into 16 reaction vessels.

Step F. The 16 aliquots of deprotected resin from step E are suspended in DMF. To each reaction vessel is added
the appropriate carboxylic acid VII1-16 (R1-16CO2H; 3 eq.), HATU (3 eq.), and DIEA (7.5 eq.) in DMF. The vessels
are shaken for 1 hr. The solvent is removed and the aliquots of resin washed with NMP (2X), MeOH (2X), and DMF
(2X). The coupling of VII1-16 to the aliquots of resin and the wash steps are repeated, to give compounds VIII1-16.

Step G. The aliquots of resin (compounds VIII1-16) are washed with CH2Cl2 (3X). To each of the reaction vessels
is added 1% TFA in CH2Cl2 and the vessels shaken for 30 min. The solvent is filtered from the reaction vessels into
individual tubes. The aliquots of resin are washed with CH2Cl2 (2X) and MeOH (2X) and the filtrates combined into
the individual tubes. The individual tubes are evaporated in vacuo, providing compounds IX1-16.

Step H. Each of the free carboxylic acids IX1-16 is dissolved in DMF. To each solution is added pyridine (1.05 eq.),
followed by pentafluorophenyl trifluoroacetate (1.1 eq.). The mixtures are stirred for 45 min. at room temperature.
The solutions are diluted with EtOAc, washed with 1 M aq. citric acid (3X) and 5% aq. NaHCO3 (3X), dried over
Na2SO4, filtered, and evaporated in vacuo, providing compounds X1-16.

EXAMPLE 2 

DEMONSTRATION OF PHOTOLYTIC CLEAVAGE OF T-L-X 

[0253] A T-L-X compound as prepared in Example 13 was irradiated with near-UV light for 7 min at room
temperature. A Rayonet fluorescence UV lamp (Southern New England Ultraviolet Co., Middletown, CT) with an emission peak
at 350 nm is used as a source of UV light. The lamp is placed at a 15-cm distance from the Petri dishes with samples.
SDS gel electrophoresis shows that >85% of the conjugate is cleaved under these conditions.

EXAMPLE 3 

PREPARATION OF FLUORESCENT LABELED PRIMERS AND DEMONSTRATION OF CLEAVAGE OF FLUOROPHORE

Synthesis and Purification of Oligonucleotides

[0254] The oligonucleotides (ODNs) are prepared on automated DNA synthesizers using the standard
phosphoramidite chemistry supplied by the vendor, or the H-phosphonate chemistry (Glen Research, Sterling, VA).
Appropriately blocked dA, dG, dC, and T phosphoramidites are commercially available in these forms, and synthetic nucleosides
may readily be converted to the appropriate form. Oligonucleotides are purified by adaptations of standard
methods. Oligonucleotides with 5'-trityl groups are chromatographed on HPLC using a 12 micrometer, 300 Å Rainin






pH 7.0, over 20 min. When detritylation is performed, the oligonucleotides are further purified by gel exclusion
chromatography.

[0255] Preparation of 2,4,6-trichlorotriazine derived oligonucleotides: 10 to 1000 ng of 5'-terminal amine linked
oligonucleotide are reacted with an excess of recrystallized cyanuric chloride in 10% n-methylpyrrolidone in alkaline (pH 8.3
to 8.5 preferably) buffer at 19°C to 25°C for 30 to 120 minutes. The final reaction conditions consist of 0.15 M sodium
borate at pH 8.3, 2 mg/ml recrystallized cyanuric chloride and 500 ug/ml respective oligonucleotide. The unreacted
cyanuric chloride is removed by size exclusion chromatography on a G-50 Sephadex (Pharmacia, Piscataway, NJ) column.
[0256] The activated purified oligonucleotide is then reacted with a 100-fold molar excess of cystamine in 0.15 M
sodium borate at pH 8.3 for 1 hour at room temperature. The unreacted cystamine is removed by size exclusion
chromatography on a G-50 Sephadex column. The derived ODN preparation is divided into 3 portions and each portion
is reacted with either (a) 20-fold molar excess of Texas Red sulfonyl chloride (Molecular Probes, Eugene, OR),
(b) 20-fold molar excess of Lissamine sulfonyl chloride (Molecular Probes, Eugene, OR), or (c) 20-fold molar excess
of fluorescein isothiocyanate. The final reaction conditions consist of 0.15 M sodium borate at pH 8.3 for 1 hour at
room temperature. The unreacted fluorochromes are removed by size exclusion chromatography on a G-50 Sephadex column.

The samples are incubated for 15 minutes at room temperature. Fluorescence is measured in a black microtiter plate.
The solution is removed from the incubation tubes (150 microliters) and placed in a black microtiter plate. The
plates are then read directly using a Fluoroskan II fluorometer (Flow Laboratories, McLean, VA) using an excitation
wavelength of 495 nm and monitoring emission at 520 nm for fluorescein.



Moles of Fluorochrome    RFU non-cleaved    RFU cleaved    RFU free
1.0 x 10^-5 M            6.4                1200           1345
3.3 x 10^-6 M            2.4                451            456
1.1 x 10^-6 M            0.9                135            1.30
3.7 x 10^-7 M            0.3                44             48
1.2 x 10^-7 M            0.12               15.3           16.0
4.1 x 10^-7 M            0.14               4.9            5.1
1.4 x 10^-8 M            0.13               2.5            2.8
4.5 x 10^-9 M            0.12               0.8            0.9



The data indicate that there is about a 200-fold increase in relative fluorescence when the fluorochrome is
cleaved from the ODN.
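The size of the fluorescence increase on cleavage can be checked directly from the RFU columns of the table above. The short Python sketch below is only an illustration; the variable and function names are hypothetical, and the values are copied from the table as printed.

```python
from typing import List, Tuple

# (RFU non-cleaved, RFU cleaved) pairs, in the row order of the table.
RFU_ROWS: List[Tuple[float, float]] = [
    (6.4, 1200.0), (2.4, 451.0), (0.9, 135.0), (0.3, 44.0),
    (0.12, 15.3), (0.14, 4.9), (0.13, 2.5), (0.12, 0.8),
]


def fold_increase(rows: List[Tuple[float, float]]) -> List[float]:
    """Ratio of cleaved to non-cleaved fluorescence for each row."""
    return [cleaved / non_cleaved for non_cleaved, cleaved in rows]


ratios = fold_increase(RFU_ROWS)
for (non_cleaved, cleaved), ratio in zip(RFU_ROWS, ratios):
    print(f"non-cleaved {non_cleaved:>5} RFU, cleaved {cleaved:>6} RFU, ratio {ratio:.1f}")
```

The ratio is largest at the highest fluorochrome concentrations and falls off in the most dilute rows, where the non-cleaved reading stops decreasing and stays near 0.12 to 0.14 RFU.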

EXAMPLE 4 

PREPARATION OF TAGGED M13 SEQUENCE PRIMERS AND DEMONSTRATION OF CLEAVAGE OF TAGS 
2s £?2ttJ£St ^ ^ ' inked *— 

[0260] The activated purified oligonucleotide is then reacted with a 100-fold molar excess of cystamine in 0.15 M 




allowed to polymerize for at least 30 minutes. Prior to loading, the tape around the bottom of the gel and the well-form- 
ing comb is removed. A vertical electrophoresis apparatus is then assembled by clamping the upper and lower buffer 
chambers to the gel plates, and adding 1 X MTBE electrophoresis buffer to the chambers. Sample wells are flushed with 
a syringe containing running buffer, and immediately prior to loading each sample, the well is flushed with running buffer 
using gel loading tips to remove urea. One to two microliters of sample is loaded into each well using a Pipetman
(Rainin, Emeryville, CA) with gel-loading tips, and then electrophoresed according to the following guidelines (during
electrophoresis, the gel is cooled with a fan):





termination reaction    polyacrylamide gel               electrophoresis conditions
short                   5%, 0.15 mm x 50 cm x 20 cm      2.25 hours at 22 mA
long                    4%, 0.15 mm x 70 cm x 20 cm      8-9 hours at 15 mA
long                    4%, 0.15 mm x 70 cm x 20 cm      20-24 hours at 15 mA



[0307] Each base-specific sequencing reaction terminated with the short termination mix is loaded onto a 0.15
mm x 50 cm x 20 cm denaturing 5% polyacrylamide gel; reactions terminated with the long termination mix typically are
divided in half and loaded onto two 0.15 mm x 70 cm x 20 cm denaturing 4% polyacrylamide gels.
[0308] After electrophoresis, buffer is removed from the wells, the tape is removed, and the gel plates separated.
The gel is transferred to a 40 cm x 20 cm sheet of 3MM Whatman paper, covered with plastic wrap, and dried on a
Hoefer (San Francisco, CA) gel dryer for 25 minutes at 80°C. The dried gel is exposed to Kodak (New Haven, CT)
XRP-1 film. Depending on the intensity of the signal and whether the radiolabel is 32P or 35S, exposure times vary from 4
hours to several days. After exposure, films are developed by processing in developer and fixer solutions, rinsed with
water, and air dried. The autoradiogram is then placed on a light-box, the sequence is manually read, and the data
typed into a computer.

[0309] Taq-polymerase catalyzed cycle sequencing using labeled primers. Each base-specific cycle sequencing
reaction routinely included approximately 100 or 200 ng isolated single-stranded DNA for A and C or G and T reactions,
respectively. Double-stranded cycle sequencing reactions similarly contained approximately 200 or 400 ng of plasmid
DNA isolated using either the standard alkaline lysis or the diatomaceous earth-modified alkaline lysis procedures. All
reagents except template DNA are added in one pipetting step from a premix of previously aliquoted stock solutions
stored at -20°C. Reaction premixes are prepared by combining reaction buffer with the base-specific nucleotide mixes.
Prior to use, the base-specific reaction premixes are thawed and combined with diluted Taq DNA polymerase and the
individual end-labeled universal primers to yield the final reaction mixes. Once the above mixes are prepared, four
aliquots of single or double-stranded DNA are pipetted into the bottom of each 0.2 ml thin-walled reaction tube,
corresponding to the A, C, G, and T reactions, and then an aliquot of the respective reaction mixes is added to the side of each
tube. These tubes are part of a 96-tube/retainer set tray in a microtiter plate format, which fits into a Perkin Elmer Cetus
Cycler 9600 (Foster City, CA). Strip caps are sealed onto the tube/retainer set and the plate is centrifuged briefly. The
plate then is placed in the cycler whose heat block had been preheated to 95°C, and the cycling program immediately
started. The cycling protocol consists of 15-30 cycles of: 95°C denaturation; 55°C annealing; 72°C extension; 95°C
denaturation; 72°C extension; 95°C denaturation; and 72°C extension, linked to a 4°C final soak file.
[0310] At this stage, the reactions may be frozen and stored at -20°C for up to several days. Prior to pooling and
precipitation, the plate is centrifuged briefly to reclaim condensation. The primer and base-specific reactions are pooled
into ethanol, and the precipitated DNA is collected by centrifugation and dried. These sequencing reactions could be
stored for several days at -20°C.

[0311] The protocol for the sequencing reactions is as follows. For A and C reactions, 1 µl, and for G and T
reactions, 2 µl of each DNA sample (100 ng/µl for M13 templates and 200 ng/µl for pUC templates) are pipetted into the
bottom of the 0.2 ml thin-walled reaction tubes. AmpliTaq polymerase (N801-0060) is from Perkin-Elmer Cetus (Foster
City, CA).

[0312] A mix of 30 µl AmpliTaq (5 U/µl), 30 µl 5X Taq reaction buffer, and 130 µl ddH2O is prepared, giving 190 µl of
diluted Taq for 24 clones.

[0313] A, C, G, and T base specific mixes are prepared by adding base-specific primer and diluted Taq to each of
the base specific nucleotide/buffer premixes:

A,C/G,T

60/120 µl 5X Taq cycle sequencing mix




30/60 µl diluted Taq polymerase
30/60 µl respective fluorescent end-labeled primer
120/240 µl
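The component volumes listed for the base-specific mixes can be checked against the stated totals with a few lines of Python. This is only an arithmetic sketch; the dictionary layout and names are hypothetical, and the volumes are those given in the list above (first value for the A and C mixes, second value for the G and T mixes).

```python
from typing import Dict, Tuple

# Component volumes in microliters: (A or C mix, G or T mix).
BASE_SPECIFIC_MIX: Dict[str, Tuple[int, int]] = {
    "5X Taq cycle sequencing mix": (60, 120),
    "diluted Taq polymerase": (30, 60),
    "fluorescent end-labeled primer": (30, 60),
}

total_ac = sum(volumes[0] for volumes in BASE_SPECIFIC_MIX.values())
total_gt = sum(volumes[1] for volumes in BASE_SPECIFIC_MIX.values())

print(f"A or C mix total: {total_ac} µl")
print(f"G or T mix total: {total_gt} µl")
```

The G and T mixes use exactly twice the volume of each component, which matches the doubled template volume pipetted for the G and T reactions.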

B. Taq-polymerase catalyzed cycle sequencing using MW-identifier-labeled terminator reactions

trnled^m^ 
--yobserv^t^^^^ 

Zf? ' S ^ Ch T SenSrt ' Ve 10 concentration than that obtained with the labeled primerTeSns asSbtd" 

^e^ 

the ? w«, tenp, ate iso^tion ar* 96 w.l react^r"^ * 
[0315] Place 0.5 µg of single-stranded or 1 µg of double-stranded DNA in 0.2 ml PCR tubes. Add 1 µl (for
single-stranded templates) or 4 µl (for double-stranded templates) of 0.8 µM primer and 9.5 µl of ABI supplied premix to each
tube, and bring the final volume to 20 µl with ddH2O. Centrifuge briefly and cycle as usual using the terminator program
as described by the manufacturer (i.e., preheat at 96°C followed by 25 cycles of 96°C for 15 seconds, 50°C for 1
second, 60°C for 4 minutes, and then link to a 4°C hold). Proceed with the spin column purification using either the
Centri-Sep columns (Amicon, Beverly, MA) or G-50 microtiter plate procedures given below.

C. Terminator Reaction Clean-Up via Centri-Sep Columns

Allow me ge, te hydS, to, ^^Zl^'^^^Z T"* *~ " 

for 2 minutes to remove the fluid. Remove ^^^0^ wa^Tbe l^"?^ ^^rifuge at 1300 xg 
Carefully remove the reaction mixture (20 & aS oaS on top oTSe ae^ maSrT »1 J« 3 

cycling instrument that required overlaying wth oil ^efuHv ^ZTlttT^ t mcubated in * 

oil with the sample, although small amounts "ToT^ * Avoid "P 

tip containing the sample can be removed by touc^no the tta cmI ^ I .Iffl °" 5 ^ « P ' pet 
each column only once. Spin in a variable* "» 
same orientation as it was in for the first spin Dry the samole inVZ,l r!l» n P 9 COlumn ,n * e 
desired, reactions can be precipitated with ethanol P °° n0t heat or over ** lf 

D. Terminator Reaction Clean-Up via Sephadex G-50 Filled Microtiter Format Filter Plates

Place microliter fitter p.a?e on top of a micmLvSat to coC wtter and tn^H T T * P ' at6 - 
trifugation. Spin at 1500 rpm for 2 minutes [SrV w^ 

microtiter filter plate on top of a microtrter 0ate E£2 Z^artX^^^"?* 7 T** P,aCe *" 
gation. Add an addttona, 100-200 of Sephadex SsTto^L ^otte^ 

to^^ 

[0318] Sequmas.™ (UBS. Clevelarf OH) catalyzed sequencing with labeled terminator 2i„„i.„„„h_. 
oator reasons require approximate!, 2 M9 ol phenol extracted Misiased ^.fcZTa SStSS.!?""- 

r Is ,na f r 2,7s r * i"i^s»s:'nri 

are precipitated and dried. To aid in the removal of unincoroorated terminainrc .ho hma „ .I . fragments 
anol. The dried sequencing reactions could be stored Tf^^ ^T^C ° * ^ ' S ^ ^ eX * 
[0319] Double-stranded terminator reactions required approximately 5 µg of diatomaceous earth modified-alkaline






lysis midi-prep purified plasmid DNA. The double-stranded DNA is denatured by incubating the DNA in sodium
hydroxide at 65°C, and after incubation, primer is added and the reaction is neutralized by adding an acid-buffer. Reaction
buffer, alpha-thio-deoxynucleotides, labeled dye-terminators, and diluted Sequenase™ DNA polymerase then are
added and the reaction is incubated at 37°C. Ammonium acetate is added to stop the reaction and the DNA fragments
are precipitated, rinsed, dried, and stored.
[0320] For single-stranded reactions:
Add the following to a 1.5 ml microcentrifuge tube:

4 µl ss DNA (2 µg)
4 µl 0.8 µM primer
2 µl 10X MOPS buffer
2 µl 10X Mn2+/isocitrate buffer
12 µl

[0321] To denature the DNA and anneal the primer, incubate the reaction at 65°C-70°C for 5 minutes. Allow the
reaction to cool at room temperature for 15 minutes, and then briefly centrifuge to reclaim condensation. To each
reaction, add the following reagents and incubate for 10 minutes at 37°C.

7 µl ABI terminator mix (Catalogue No. 401489)
2 µl diluted Sequenase™ (3.25 U/µl)
1 µl 2 mM alpha-S dNTPs
22 µl

[0322] The undiluted Sequenase™ (Catalogue No. 70775, United States Biochemicals, Cleveland, OH) is 13 U/µl
and is diluted 1:4 with USB dilution buffer prior to use. Add 20 µl 9.5 M ammonium acetate and 100 µl 95% ethanol to
stop the reaction and mix.
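The enzyme dilution and the running reaction volume in the single-stranded protocol can be verified with simple arithmetic. The sketch below assumes that "diluted 1:4" means one part enzyme in four parts total, which is the reading consistent with the stated 3.25 U/µl; the variable names are hypothetical.

```python
# Sequenase stock and dilution, as stated in the protocol.
STOCK_UNITS_PER_UL = 13.0
DILUTION_FACTOR = 4  # assumed: 1 part enzyme in 4 parts total

diluted_units_per_ul = STOCK_UNITS_PER_UL / DILUTION_FACTOR
units_per_reaction = 2 * diluted_units_per_ul  # 2 µl of diluted enzyme per reaction

# Volumes in microliters for the two additions of the single-stranded reaction.
ANNEALING_MIX = {"ss DNA": 4, "0.8 µM primer": 4, "10X MOPS buffer": 2, "10X Mn2+/isocitrate buffer": 2}
EXTENSION_ADDITIONS = {"ABI terminator mix": 7, "diluted Sequenase": 2, "2 mM alpha-S dNTPs": 1}

annealing_volume = sum(ANNEALING_MIX.values())
final_volume = annealing_volume + sum(EXTENSION_ADDITIONS.values())

print(f"diluted enzyme: {diluted_units_per_ul} U/µl, {units_per_reaction} U per reaction")
print(f"annealing volume: {annealing_volume} µl, final volume: {final_volume} µl")
```

Both running totals agree with the 12 µl and 22 µl subtotals listed in the protocol.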

[0323] Precipitate the DNA in an ice-water bath for 10 minutes. Centrifuge for 15 minutes at 10,000 x g in a
microcentrifuge at 4°C. Carefully decant the supernatant and rinse the pellet by adding 300 µl of 70-80% ethanol. Mix and
centrifuge again for 15 minutes and carefully decant the supernatant.

[0324] Repeat the rinse step to ensure efficient removal of the unincorporated terminators. Dry the DNA for 5-10
minutes (or until dry) in the Speed-Vac, and store the dried reactions at -20°C.

[0325] For double-stranded reactions:

Add the following to a 1.5 ml microcentrifuge tube:

5 µl ds DNA (5 µg)
4 µl 1 N NaOH
3 µl ddH2O

[0326] Incubate the reaction at 65°C-70°C for 5 minutes, and then briefly centrifuge to reclaim condensation. Add
the following reagents to each reaction, vortex, and briefly centrifuge:

3 µl 8 µM primer
9 µl ddH2O
4 µl MOPS-Acid buffer

[0327] To each reaction, add the following reagents and incubate for 10 minutes at 37°C.

4 µl 10X Mn2+/isocitrate buffer
6 µl ABI terminator mix
2 µl diluted Sequenase™ (3.25 U/µl)
1 µl 2 mM alpha-S-dNTPs
22 µl

[0328] The undiluted Sequenase™ from United States Biochemicals is 13 U/µl and should be diluted 1:4 with
USB dilution buffer prior to use. Add 60 µl 8 M ammonium acetate and 300 µl 95% ethanol to stop the reaction and
vortex. Precipitate the DNA in an ice-water bath for 10 minutes. Centrifuge for 15 minutes at 10,000 x g in a microcentrifuge
at 4°C. Carefully decant the supernatant, and rinse the pellet by adding 300 µl of 80% ethanol. Mix the sample and
centrifuge again for 15 minutes, and carefully decant the supernatant. Repeat the rinse step to ensure efficient removal of




the unincorporated terminators. Dry the DNA for 5-10 minutes (or until dry) in the Speed-Vac. 

E. Sequencing gel preparation, pre-electrophoresis, sample loading, electrophoresis, data collection, and analysis

[0329] Polyacrylamide gels for DNA sequencing are prepared as described above, except that the gel mix is filtered
prior to polymerization. Glass plates are carefully cleaned with hot water, distilled water, and ethanol to remove potential
fluorescent contaminants prior to taping. Denaturing 6% polyacrylamide gels are poured into 0.3 mm x 89 cm x 52 cm
taped plates and fitted with 36-well combs. After polymerization, the tape and the combs are removed from the gel and
the outer surfaces of the glass plates are cleaned with hot water, and rinsed with distilled water and ethanol. The gel is
assembled into an ABI sequencer, and then checked by laser-scanning. If baseline alterations are observed on the
ABI-associated Macintosh computer display, the plates are recleaned. Subsequently, the buffer wells are attached,
electrophoresis buffer is added, and the gel is pre-electrophoresed for 10-30 minutes at 30 W. Prior to sample loading, the
pooled and dried reaction products are resuspended in formamide/EDTA loading buffer by vortexing and then heated
at 90°C. A sample sheet is created within the ABI data collection software on the Macintosh computer which indicates
the number of samples loaded and the fluorescent-labeled mobility file to use for sequence data processing. After
cleaning the sample wells with a syringe, the odd-numbered sequencing reactions are loaded into the respective wells
using a micropipettor equipped with a flat-tipped gel-loading tip. The gel is then electrophoresed for 5 minutes before
the wells are cleaned again and the even-numbered samples are loaded. The filter wheel used for dye-primers and
dye-terminators is specified on the ABI 373A CPU. Typically electrophoresis and data collection are for 10 hours at 30 W on
the ABI 373A that is fitted with a heat-distributing aluminum plate. After data collection, an image file is created by the
ABI software that relates the fluorescent signal detected to the corresponding scan number. The software then
determines the sample lane positions based on the signal intensities. After the lanes are tracked, the cross-section of data
for each lane is extracted and processed by baseline subtraction, mobility calculation, spectral deconvolution, and
time correction. After processing, the sequence data files are transferred to a SPARCstation using NFS Share.
[0330] Protocol: prepare 8 M urea, 4.75% polyacrylamide gels, as described above, using a 36-well comb. Prior to
loading, clean the outer surface of the gel plates. Assemble the gel plates into an ABI 373A DNA Sequencer (Foster
City, CA) so that the lower scan (usually the blue) line corresponds to an intensity value of 800-1000 as observed in
the computer data collection window. If the baseline of four-color scan lines is not flat, reclean the glass plates. Affix the
aluminum heat distribution plate. Pre-electrophorese the gel for 10-30 minutes. Prepare the samples for loading. Add 3
µl of FE to the bottom of each tube, vortex, heat at 90°C for 3 minutes, and centrifuge to reclaim condensation. Flush
the sample wells with electrophoresis buffer using a syringe. Using flat-tipped gel loading pipette tips, load each
odd-numbered sample. Pre-electrophorese the gel for at least 5 minutes, flush the wells again, and then load each
even-numbered sample. Begin the electrophoresis (30 W for 10 hours). After data collection, the ABI software automatically
creates the image file and processes the data for each lane.

F. Double-stranded sequencing of cDNA clones containing long poly(A) tails using anchored poly(dT) primers

[0331] Double-stranded templates of cDNAs containing long poly(A) tracts are difficult to sequence with vector
primers which anneal downstream of the poly(A) tail. Sequencing with these primers results in a long poly(T) ladder
followed by a sequence which may be difficult to read. To circumvent this problem, three primers which contain (dT)17 and
either (dA) or (dC) or (dG) at the 3' end were designed to "anchor" the primers and allow sequencing of the region
immediately upstream of the poly(A) region. Using this protocol, over 300 bp of readable sequence could be obtained. The
sequence of the opposite strand of these cDNAs was determined using insert-specific primers upstream of the poly(A)
region. The ability to directly obtain sequence immediately upstream from the poly(A) tail of cDNAs should be
of particular importance to large scale efforts to generate sequence-tagged sites (STSs) from cDNAs.
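The three anchored primers described above are easy to write out explicitly. The following Python sketch simply builds the sequences from the description, a run of 17 T residues followed by a single A, C, or G at the 3' end; the function name `anchored_poly_dt` is hypothetical.

```python
from typing import List, Sequence


def anchored_poly_dt(n_t: int = 17, anchors: Sequence[str] = ("A", "C", "G")) -> List[str]:
    """Return 5'->3' primer sequences: (dT)n followed by one anchor base at the 3' end."""
    return ["T" * n_t + anchor for anchor in anchors]


primers = anchored_poly_dt()
for primer in primers:
    print(f"5'-{primer}-3'")
```

T is left out of the anchor set because a 3' T would still pair within the poly(A) tract and would not fix the primer at the upstream end of the tail.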
[0332] The protocol is as follows. Synthesize anchored poly(dT)17 with anchors of (dA) or (dC) or (dG) at the 3' end
on a DNA synthesizer and use after purification on Oligonucleotide Purification Cartridges (Amicon, Beverly, MA). For
sequencing w.th anchored pnmers. denature 5- M g of plasmid DNA in a total volume of 50 i containing J I sodium 
hydrox.de and 0.16 mM EDTA by incubation at 65'C for 10 minutes. Add the three po.y(dT^ Tan^o^ primers (2^md 
tfeach) and .mmediately p.ace the mixture on ice. Neutra.ize the solution by adding I ml of 5 M am^onTum f icelto. 

[0333] Precipitate the DNA by adding 150 µl of cold 95% ethanol and wash the pellet twice with cold 70% ethanol.

utes a? SS?£ 5 ^r 5 , 3 ™ 1 resus P end in MOp S buffer. Annea. the primers by heating the so Z2££SZ. 
utes at 65°C followed by slow cooling to room temperature for is-^n mint**e d^^JL ^ u * mm 

"»"■« " DNA po^se an, .^WIP (^^^^^^^^T^ — 




G. cDNA sequencing based on PCR and random shotgun cloning 

[0334] The following is a method for sequencing cloned cDNAs based on PCR amplification, random shotgun
cloning, and automated fluorescent sequencing. This PCR-based approach uses a primer pair between the usual
"universal" forward and reverse priming sites and the multiple cloning sites of the Stratagene Bluescript vector. These two PCR
primers, with the sequence 5'-TCGAGGTCGACGGTATCG-3' (Seq. ID No. 15) for the forward or -16bs primer and
5'-GCCGCTCTAGAACTAGTG-3' (Seq. ID No. 16) for the reverse or +19bs primer, may be used to amplify sufficient
quantities of cDNA inserts in the 1.2 to 3.4 kb size range so that the random shotgun sequencing approach described below
could be implemented.
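A quick sanity check of the two PCR primers quoted above can be done in Python. This is an illustrative sketch only: it counts length and G+C content and uses the Wallace rule (2°C per A/T, 4°C per G/C) as a rough melting-temperature estimate, which is an assumption introduced here and not a figure from the text.

```python
from typing import Dict

PRIMERS: Dict[str, str] = {
    "forward (-16bs)": "TCGAGGTCGACGGTATCG",
    "reverse (+19bs)": "GCCGCTCTAGAACTAGTG",
}


def gc_count(sequence: str) -> int:
    """Number of G and C residues in a DNA sequence."""
    return sum(1 for base in sequence.upper() if base in "GC")


def wallace_tm(sequence: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (assumed estimate)."""
    gc = gc_count(sequence)
    at = len(sequence) - gc
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: length {len(seq)} nt, GC {gc_count(seq)}/{len(seq)}, Tm ~ {wallace_tm(seq)} C")
```

Both primers are 18 nucleotides long, so their estimated melting temperatures fall close to each other.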

[0335] The following is the protocol. Incubate four 100 µl PCR reactions, each containing approximately 100 ng of
plasmid DNA, 100 pmoles of each primer, 50 mM KCl, 10 mM Tris-HCl pH 8.5, 1.5 mM MgCl2, 0.2 mM of each dNTP,
and 5 units of PE-Cetus Amplitaq in 0.5 ml snap cap tubes for 25 cycles of 95°C for 1 minute, 55°C for 1 minute and
72°C for 2 minutes in a PE-Cetus 48 tube DNA Thermal Cycler. After pooling the four reactions, the aqueous solution
containing the PCR product is placed in a nebulizer, brought to 2.0 ml by adding approximately 0.5 to 1.0 ml of glycerol,
and equilibrated at -20°C by placing it in either an isopropyl alcohol/dry ice or saturated aqueous NaCl/dry ice bath
for 10 minutes. The sample is nebulized at -20°C by applying 25 - 30 psi nitrogen pressure for 2.5 min. Following ethanol
precipitation to concentrate the sheared PCR product, the fragments are blunt ended and phosphorylated by incubation
with the Klenow fragment of E. coli DNA polymerase and T4 polynucleotide kinase as described previously. Fragments
in the 0.4 to 0.7 kb range are obtained by elution from a low melting agarose gel.
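The quantities in this protocol imply a few derived figures that are easy to check. The sketch below is a worked calculation under stated assumptions: the variable names are illustrative, and the hold time ignores temperature ramping, which the protocol does not specify.

```python
# Worked arithmetic for the PCR set-up; variable names are illustrative assumptions.
REACTIONS = 4
REACTION_VOLUME_UL = 100.0       # each reaction is 100 µl
PRIMER_PMOL = 100.0              # pmoles of each primer per reaction
TEMPLATE_NG = 100.0              # approximate plasmid DNA per reaction
CYCLES = 25
STEP_MINUTES = (1.0, 1.0, 2.0)   # 95°C, 55°C and 72°C steps

# 1 pmol/µl equals 1 µM, so this is the concentration of each primer.
primer_concentration_um = PRIMER_PMOL / REACTION_VOLUME_UL

# Volume and template mass after pooling the four reactions.
pooled_volume_ul = REACTIONS * REACTION_VOLUME_UL
pooled_template_ng = REACTIONS * TEMPLATE_NG

# Programmed hold time only; ramping between temperatures is not included.
hold_time_minutes = CYCLES * sum(STEP_MINUTES)

print(primer_concentration_um, pooled_volume_ul, pooled_template_ng, hold_time_minutes)
```

Each primer is therefore present at 1 µM, the pooled product occupies 400 µl before it is brought to 2.0 ml in the nebulizer, and the programmed cycling holds total 100 minutes.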

[0336] From the foregoing, it will be appreciated that, although specific embodiments of the invention have been 
described herein for purposes of illustration, various modifications may be made without deviating from the spirit and 
scope of the invention. Accordingly, the invention is not limited except as by the appended claims. 






SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: Van Ness, Jeffrey
Tabone, John C.
Howbert, J. Jeffry
Mulligan, John T.

(ii) TITLE OF INVENTION: METHODS AND COMPOSITIONS FOR DETERMINING
THE SEQUENCE OF NUCLEIC ACID MOLECULES

(iii) NUMBER OF SEQUENCES: 16

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SEED and BERRY
(B) STREET: 6300 Columbia Center, 701 Fifth Avenue
(C) CITY: Seattle
(D) STATE: Washington
(E) COUNTRY: USA
(F) ZIP: 98104-7092

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 22-JAN-1997
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: McMasters, David D.
(B) REGISTRATION NUMBER: 33,963
(C) REFERENCE/DOCKET NUMBER: 240052.416

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (206) 622-4900
(B) TELEFAX: (206) 682-6031



(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
TGTAAAACGA CGGCCAGT

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
TGTAAAACGA CGGCCAGTA

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
TGTAAAACGA CGGCCAGTAT

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
TGTAAAACGA CGGCCAGTAT G

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TGTAAAACGA CGGCCAGTAT GC

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
TGTAAAACGA CGGCCAGTAT GCA

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
TGTAAAACGA CGGCCAGTAT GCAT

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
TGTAAAACGA CGGCCAGTAT GCATG

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
TGTAAAACGA CGGCCACG

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
TGTAAAACGA CGGCCAGCG

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
TGTAAAACGA CGGCCAGCGT

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TGTAAAACGA CGGCCAGCGT A

(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
TGTAAAACGA CGGCCAGCGT AC

(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
TGTAAAACGA CGGCCAGCGT ACC

(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
TCGAGGTCGA CGGTATCG

(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
GCCGCTCTAG AACTAGTG
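Because each entry in the listing declares a length, the listing can be checked mechanically. The sketch below is an illustration under stated assumptions: the sequences are transcribed here with spaces removed and with apparent scanning errors read as the intended bases, and the function names are hypothetical. It compares each declared length with the actual length and tests whether SEQ ID NOs 1 to 8 form a ladder in which each sequence extends the previous one by a single 3' base.

```python
# Transcription of the listing (an assumption; spaces removed, scan errors read as intended).
SEQUENCES = {
    1: ("TGTAAAACGACGGCCAGT", 18),
    2: ("TGTAAAACGACGGCCAGTA", 19),
    3: ("TGTAAAACGACGGCCAGTAT", 20),
    4: ("TGTAAAACGACGGCCAGTATG", 21),
    5: ("TGTAAAACGACGGCCAGTATGC", 22),
    6: ("TGTAAAACGACGGCCAGTATGCA", 23),
    7: ("TGTAAAACGACGGCCAGTATGCAT", 24),
    8: ("TGTAAAACGACGGCCAGTATGCATG", 25),
    9: ("TGTAAAACGACGGCCACG", 18),
    10: ("TGTAAAACGACGGCCAGCG", 19),
    11: ("TGTAAAACGACGGCCAGCGT", 20),
    12: ("TGTAAAACGACGGCCAGCGTA", 21),
    13: ("TGTAAAACGACGGCCAGCGTAC", 22),
    14: ("TGTAAAACGACGGCCAGCGTACC", 23),
    15: ("TCGAGGTCGACGGTATCG", 18),
    16: ("GCCGCTCTAGAACTAGTG", 18),
}


def length_mismatches(sequences):
    """Return the SEQ ID numbers whose actual length differs from the declared length."""
    return [n for n, (seq, declared) in sequences.items() if len(seq) != declared]


def is_single_base_ladder(seqs):
    """True if every sequence is the previous one plus exactly one 3' base."""
    return all(len(b) == len(a) + 1 and b.startswith(a) for a, b in zip(seqs, seqs[1:]))


mismatches = length_mismatches(SEQUENCES)
ladder_1_to_8 = is_single_base_ladder([SEQUENCES[n][0] for n in range(1, 9)])
print(mismatches, ladder_1_to_8)
```

With this transcription every declared length matches and SEQ ID NOs 1 to 8 form a single-base ladder. A check of this kind confirms only internal consistency, not that the transcription matches the filed sequences.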




Claims 

1. A method comprising:

(a) providing DNA fragments, each fragment having cleavably attached thereto a mass tag;
(b) separating the tagged fragments on the basis of fragment charge, size or shape;
(c) placing a charge on the tag and cleaving the tag from the fragment, where the charge is placed on the tag
prior to, during, or after cleaving the tag from the fragment; and
(d) determining each tag by mass spectrometry.

2. A method for determining the sequence of a nucleic acid molecule, comprising:

[text illegible]

3. A method for determining the sequence of a nucleic acid molecule, comprising:

(a) generating tagged nucleic acid fragments which are complementary to a selected [text illegible]

[text illegible] tags by [text illegible] spectrometry [text illegible] the sequence of the nucleic [text illegible]

4. A compound of the formula T^ms-L-X wherein,

[text illegible]

X is a functional group selected from hydroxyl, amino, thiol, carboxylic acid, haloalkyl, and derivatives thereof
which either activate or inhibit the activity of the group toward coupling with other moieties [text illegible]

5. A compound of the formula T^ms-L-X wherein,

[text illegible] detectable by mass spectrometry, [text illegible] does not contain a [text illegible]

X is a nucleic acid;

where the compound is not bound to a solid support.

7. A composition comprising first and second compounds of the formula T^ms-L^hv-X wherein,

L^hv is a chemical group which, upon exposure to light of selected wavelength, allows cleavage of T^ms from X;
T^ms is an organic group which is detectable by mass spectrometry and comprises a variable mass component;






and 

X is nucleic acid; 

with the proviso that the variable mass component in the first compound has a mass that is not identical to the
mass of the variable mass component of the second compound, but the T^ms groups in the first and second
compounds are otherwise identical.

8. A compound according to any one of claims 4 or 5 wherein L is L^hv, or a composition comprising compounds
according to claim 7, wherein L^hv has the formula L^1-L^2-L^3 and -L^2-L^3 has the formula:

[structural formula with positions a, b, c, d and e not reproduced]

with one carbon atom at positions a, b, c, d or e being substituted with -L^3-X and optionally one or more of
positions b, c, d or e being substituted with alkyl, alkoxy, fluoride, chloride, hydroxyl, carboxylate or amide; and R^1
is hydrogen or hydrocarbyl.

9. A compound or composition according to claim 8 wherein -L^3-X is located at position a.

10. A compound, or composition comprising compounds, according to any one of claims 4-9 where the compound
comprises the formula:

[structural formula not reproduced]

wherein

G is (CH2)1-6 wherein a hydrogen on one and only one of the CH2 groups of each G is replaced with -(CH2)c-Amide-T^4;

T^2 and T^4 are organic moieties of the formula C1-25 N0-9 O0-9 S0-3 P0-3 Hα Fβ Iδ wherein the sum of α, β and δ is
sufficient to satisfy the otherwise unsatisfied valencies of the C, N, O, S and P atoms;

Amide is -N(R^1)-C(=O)- or -C(=O)-N(R^1)-;

R^1 is hydrogen or C1-10 alkyl;

c is an integer ranging from 0 to 4; and

n is an integer ranging from 1 to 50 such that when n is greater than 1, G, c, Amide, R^1 and T^4 are independently
selected.
11. A compound, or composition comprising compounds, according to any one of claims [numbers illegible] where the compound
comprises the formula:

[structural formula not reproduced]

wherein [text illegible] of the formula C1-25 N0-9 O0-9 S0-3 P0-3 Hα Fβ Iδ wherein the sum of α, β and δ
is sufficient to satisfy the otherwise unsatisfied valencies of the C, N, O, S and P atoms; and T^5 includes a tertiary
or quaternary amine or an organic acid; and m is an integer ranging from 0-49.

12. A compound, or composition comprising compounds, according to any one of claims [numbers illegible] where the compound
comprises the formula:

[structural formula not reproduced]

wherein [text illegible] organic moiety of the formula C1-25 N0-9 O0-9 S0-3 P0-3 Hα Fβ Iδ wherein the sum of α, β and δ
is sufficient to satisfy the otherwise unsatisfied valencies of the C, N, O, S and P atoms; and T^5 includes a tertiary
or quaternary amine or an organic acid; and m is an integer ranging from 0-49.

13. A composition comprising a plurality of compounds according to any one of claims 4 or 5, or a composition of claim
7, wherein no two compounds in the composition have either the same T^ms or the same X.

14. A compound according to claim 5, or a composition according to claim 7, wherein X is a DNA sequencing primer.

15. A compound of the formula (T^ms)n-L-X wherein,

T^ms is an organic group detectable by mass spectrometry, comprising carbon, at least one of hydrogen and fluoride,
and optional atoms selected from oxygen, nitrogen, sulfur, phosphorus and iodine;

L is an organic group which allows a T^ms-containing moiety to be cleaved from the remainder of the compound,

wherein the T^ms-containing moiety comprises a functional group which supports a single ionized charge state
when the compound is subjected to mass spectrometry and is tertiary amine, quaternary amine or organic
acid;

X is a functional group selected from hydroxyl, amino, thiol, carboxylic acid, haloalkyl, and derivatives thereof
which either activate or inhibit the activity of the group toward coupling with other moieties; and
n is a number of T^ms groups, where n is an integer greater than one.

16. A compound of the formula (T^ms)n-L-X wherein,

T^ms is an organic group detectable by mass spectrometry, with the proviso that T^ms does not comprise reporter
groups;

L is an organic group which allows a T^ms-containing moiety to be cleaved from the remainder of the compound,
wherein the T^ms-containing moiety comprises a functional group which supports a single ionized charge state
when the compound is subjected to mass spectrometry and is tertiary amine, quaternary amine or organic
acid;

X is a nucleic acid; and

n is a number of T^ms groups, where n is an integer greater than one.

17. A compound of the formula T^ms-L-X wherein,

T^ms is an organic group detectable by mass spectrometry, with the proviso that T^ms does not comprise reporter
groups;

L is an organic group which allows a T^ms-containing moiety to be cleaved from the remainder of the compound,
wherein the T^ms-containing moiety comprises a functional group which supports a single ionized charge state
when the compound is subjected to mass spectrometry and is tertiary amine, quaternary amine or organic
acid; and

X is a molecule of interest (MOI) selected from proteins, peptides, antibodies or antibody fragments, receptors,
receptor ligands, members of a ligand pair, cytokines, hormones, oligosaccharides, synthetic organic molecules,
and drugs.

18. A compound according to claims 4, 5, 15, 16 or 17 wherein the tag is non-volatile and thermally labile.



FIGURE 1

FIGURE 2

FIGURE 3

FIGURE 5

FIGURE 6 [diagram not reproduced; one partially legible label reads "Divide into 36 readers"]

FIGURE 7

FIGURE 8 [reaction scheme not reproduced; legible labels include NovaSyn HMP resin, Steps B, C, G and I, and the reagents HATU, NMM and PhSiH3]

FIGURE 9

FIGURE 10 [reaction scheme not reproduced; legible labels: 5'-aminohexyl tailed oligonucleotides; 100 mM sodium borate, pH 8.3; cyanuric chloride; Step B]

FIGURE 13 [diagram not reproduced; legible labels: Variable Weight Component; Weight Range Adjuster; Mass Spec Sensitivity Enhancer; Photocleavable Linker; 5'-Aminohexyl Tailed Oligonucleotide]



European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 99 11 3790

DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document with indication, where appropriate, of relevant passages
(Category column not legible; relevant to claim, as listed for the cited documents: 1-18)

WO 95 04160 A (ISIS INNOVATION ; SOUTHERN EDWIN (GB); CUMMINS WILLIAM JONATHAN (GB)
9 February 1995 (1995-02-09)
* the whole document *

WO 94 16101 A (KOESTER HUBERT)
21 July 1994 (1994-07-21)
* claims 1-73 *

WO 94 21822 A (KOESTER HUBERT)
29 September 1994 (1994-09-29)
* claims 1-55; figures 1-8 *

JACOBSON K B ET AL: "APPLICATIONS OF MASS SPECTROMETRY TO DNA SEQUENCING"
GENETIC ANALYSIS TECHNIQUES AND APPLICATIONS, US, ELSEVIER SCIENCE PUBLISHING, NEW YORK,
vol. 8, no. 8, page 223-229 XP000271820
ISSN: 1050-3862
* the whole document *

WO 95 28640 A (UNIV COLUMBIA ; COLD SPRING HARBOR LAB (US); STILL W CLARK (US); WI)
26 October 1995 (1995-10-26)
* abstract; claims 1,9,38,39 *

US 5 118 605 A (URDEA MICHAEL S)
2 June 1992 (1992-06-02)
* the whole document *

WO 95 14108 A (SCHWARZ TEREK ; AMERSHAM INT PLC (GB); HOWE ROLAND PAUL (GB); REEVE)
26 May 1995 (1995-05-26)
* abstract; claims 1-29 *

-/--

CLASSIFICATION OF THE APPLICATION (Int.Cl.7): C07H21/00, C12Q1/68
TECHNICAL FIELDS SEARCHED (Int.Cl.7): C07H, C12Q

The present search report has been drawn up for all claims

Place of search: THE HAGUE
Date of completion of the search: 21 January 2000
Examiner: Scott, J

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another document of the same category
A : technological background
O : non-written disclosure
P : intermediate document
T : theory or principle underlying the invention
E : earlier patent document, but published on or after the filing date
D : document cited in the application
L : document cited for other reasons
& : member of the same patent family, corresponding document

DOCUMENTS CONSIDERED TO BE RELEVANT (continued; relevant to claim, as listed for each document: 1-18)

B B BROWN ET AL: "A Single-Bead Decode Strategy Using Electrospray Ionization Mass Spectrometry and a New Photolabile
Linker: 3-Amino-3-(2-nitrophenyl)propionic Acid"
MOLECULAR DIVERSITY, NL, ESCOM SCIENCE PUBLISHERS, LEIDEN,
[volume and issue illegible], page 4-12 XP002094265
ISSN: 1381-1991
* the whole document, but especially the new photolabile linker: 3-amino-3-(2-nitrophenyl)propionic acid *

WO 95 25737 A (PENN STATE RES FOUND ; [name illegible] STEPHEN J (US); WINOGRAD NICHOLAS ()
28 September 1995 (1995-09-28)
* the whole document *



ANNEX TO THE EUROPEAN SEARCH REPORT
ON EUROPEAN PATENT APPLICATION NO. EP 99 11 3790

This annex lists the patent family members relating to the patent documents cited in the above-mentioned European search report.
The members are as contained in the European Patent Office EDP file on
The European Patent Office is in no way liable for these particulars which are merely given for the purpose of information.

21-01-2000

Patent document          Publication      Patent family          Publication
cited in search report   date             member(s)              date

WO 9504160 A             [not legible]    AT 159767 T            15-11-1997
                                          AU 695349 B            13-08-1998
                                          AU 7269194 A           28-02-1995
                                          CA 2168010 A           09-02-1995
                                          CN 1131440 A           18-09-1996
                                          DE 69406544 D          04-12-1997
                                          DE 69406544 T          26-02-1998
                                          DK 711362 T            22-12-1997
                                          EP 0711362 A           15-05-1996
                                          EP 0778280 A           11-06-1997
                                          ES 2108479 T           16-12-1997
                                          FI 960403 A            29-01-1996
                                          HU 73802 A             30-09-1996
                                          JP 9501830 T           25-02-1997
                                          NO 960370 A            28-03-1996
                                          US 5770367 A           23-06-1998

WO 9416101 A             [not legible]    AU [illegible] B       06-08-1998
                                          AU [illegible] A       15-08-1994
                                          AU [illegible] A       14-01-1999
                                          CA [illegible] A       21-07-1994
                                          EP [illegible] A       02-11-1995
                                          JP [illegible]         22-10-1996
                                          US [illegible]         20-08-1996
                                          US 5605798 A           25-02-1997
                                          US 5691141 A           25-11-1997

WO 9421822 A             29-09-1994       AU 687801 B            05-03-1998
                                          AU 6411694 A           11-10-1994
                                          CA 2158642 A           29-09-1994
                                          EP 0689610 A           03-01-1996
                                          JP 8507926 T           27-08-1996
                                          US 5622824 A           22-04-1997
                                          US 5872003 A           16-02-1999
                                          US 5851765 A           22-12-1998

WO 9528640 A             26-10-1995       US 5565324 A           15-10-1996
                                          AU 2292695 A           10-11-1995
                                          CA 2187792 A           26-10-1995
                                          CN 1151793 A           11-06-1997
                                          EP 0755514 A           29-01-1997
                                          HU 74985 A             28-03-1997
                                          JP 10502614 T          10-03-1998
                                          NO 964332 A            03-12-1996
                                          US 5968736 A           19-10-1999
                                          US 5789172 A           04-08-1998

For more details about this annex: see Official Journal of the European Patent Office, No. 12/82



ANNEX TO THE EUROPEAN SEARCH REPORT
ON EUROPEAN PATENT APPLICATION NO. EP 99 11 3790

21-01-2000

Patent document          Publication      Patent family          Publication
cited in search report   date             member(s)              date

US 5118605 A             [not legible]    US 4775619 A           [not legible]
                                          AT 133714 T            15-02-1996
                                          AT 168724 T            15-08-1998
                                          DE 3854969 D           14-03-1996
                                          DE 3854969 T           30-05-1996
                                          DE [illegible]         27-08-1998
                                          DE 3856224 T           03-12-1998
                                          EP 0360940 A           04-04-1990
                                          EP 0703296 A           27-03-1996
                                          ES 2083955 T           01-05-1996
                                          JP 2092300 A           03-04-1990
                                          JP 2676535 B           17-11-1997
                                          US 5258506 A           02-11-1993
                                          US 5545730 A           13-08-1996
                                          US 5578717 A           26-11-1996
                                          US 5552538 A           03-09-1996
                                          US 5430136 A           04-07-1995
                                          US 5367066 A           22-11-1994
                                          US 5380833 A           10-01-1995

WO 9514108 A             26-05-1995       EP 0765401 A           02-04-1997
                                          JP 9505397 T           27-05-1997
                                          US 5849542 A           15-12-1998

WO 9525737 A             28-09-1995       EP 0751950 A           08-01-1997
                                          JP 9510711 T           28-10-1997
                                          US 5834195 A           10-11-1998



